N-TUPLE OR RAM BASED NEURAL NETWORK CLASSIFICATION SYSTEM AND
METHOD 

BACKGROUND OF THE INVENTION 


1. Field of the Invention

The present invention relates generally to n-tuple or RAM based neural network classi- 
fication systems and, more particularly, to n-tuple or RAM based classification systems 
where the decision criteria applied to obtain the output scores and compare these output
scores to obtain a classification are determined during a training process.

2. Description of the Prior Art 

A known way of classifying objects or patterns represented by electric signals or binary
codes and, more precisely, by vectors of signals applied to the inputs of neural network 
classification systems lies in the implementation of a so-called learning or training 
phase. This phase generally consists of the configuration of a classification network that 
fulfils a function of performing the envisaged classification as efficiently as possible by
using one or more sets of signals, called learning or training sets, where the membership
of each of these signals in one of the classes in which it is desired to classify them
is known. This method is known as supervised learning or learning with a teacher. 

One subclass of classification networks using supervised learning consists of networks using
memory-based learning. Here, one of the oldest memory-based networks is the "n-tuple
network" proposed by Bledsoe and Browning (Bledsoe, W.W. and Browning, I., 1959,
"Pattern recognition and reading by machine", Proceedings of the Eastern Joint Computer
Conference, pp. 225-232) and more recently described by Morciniec and Rohwer
(Morciniec, M. and Rohwer, R., 1996, "A theoretical and experimental account of n-tuple
classifier performance", Neural Comp., pp. 629-642).

One of the benefits of such a memory-based system is a very fast computation time, 
both during the learning phase and during classification. For the known types of n-tuple
networks, which are also known as "RAM networks" or "weightless neural networks",
learning may be accomplished by recording features of patterns in a random-access
memory (RAM), which requires just one presentation of the training set(s) to the system.

The training procedure for a conventional RAM based neural network is described by
Jørgensen (co-inventor of this invention) et al. in a contribution to a recent book on
RAM based neural networks (T.M. Jørgensen, S.S. Christensen, and C. Liisberg,
"Cross-validation and information measures for RAM based neural networks", RAM-based
neural networks, J. Austin, ed., World Scientific, London, pp. 78-88, 1998). The contribution
describes how the RAM based neural network may be considered as comprising
a number of Look Up Tables (LUTs). Each LUT may probe a subset of a binary input
data vector. In the conventional scheme the bits to be used are selected at random. The
sampled bit sequence is used to construct an address. This address corresponds to a
specific entry (column) in the LUT. The number of rows in the LUT corresponds to the
number of possible classes. For each class the output can take on the values 0 or 1. A
value of 1 corresponds to a vote on that specific class. When performing a classification,
an input vector is sampled, the output vectors from all LUTs are added, and subsequently
a winner-takes-all decision is made to classify the input vector. In order to
perform a simple training of the network, the output values may initially be set to 0. For
each example in the training set, the following steps should then be carried out:


Present the input vector and the target class to the network, for all LUTs calculate their 
corresponding column entries, and set the output value of the target class to 1 in all the 
"active" columns. 

By use of such a training strategy it may be guaranteed that each training pattern always
obtains the maximum number of votes on the true class. As a result such a network
makes no misclassification on the training set, but ambiguous decisions may occur.
Here, the generalisation capability of the network is directly related to the number of
input bits for each LUT. If a LUT samples all input bits then it will act as a pure memory
device and no generalisation will be provided. As the number of input bits is reduced
the generalisation is increased at the expense of an increasing number of ambiguous
decisions. Furthermore, the classification and generalisation performances of a LUT are
highly dependent on the actual subset of input bits probed. The purpose of an "intelligent"
training procedure is thus to select the most appropriate subsets of input data.
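
The conventional training and winner-takes-all classification described above can be sketched as follows. This is only an illustrative outline, assuming binary input vectors given as lists of 0/1 values, integer class labels and random selection of input bits; the function names are hypothetical and are not taken from the cited work.

```python
import random

def make_luts(num_luts, n, input_len, seed=0):
    """Pick n random input-bit positions for each LUT (conventional scheme)."""
    rng = random.Random(seed)
    return [rng.sample(range(input_len), n) for _ in range(num_luts)]

def address(bits, positions):
    """Read the sampled bits as a binary number to obtain a column address."""
    addr = 0
    for p in positions:
        addr = (addr << 1) | bits[p]
    return addr

def train_binary(luts, examples, num_classes):
    """One pass over the training set: set the target-class cell to 1
    in every addressed ("active") column."""
    tables = [dict() for _ in luts]  # column address -> list of 0/1, one per class
    for bits, target in examples:
        for table, positions in zip(tables, luts):
            column = table.setdefault(address(bits, positions), [0] * num_classes)
            column[target] = 1
    return tables

def classify_wta(luts, tables, bits, num_classes):
    """Add the output vectors of all LUTs and pick the class with most votes."""
    scores = [0] * num_classes
    for table, positions in zip(tables, luts):
        column = table.get(address(bits, positions))
        if column is not None:
            for c in range(num_classes):
                scores[c] += column[c]
    winner = max(range(num_classes), key=scores.__getitem__)
    return winner, scores
```

With this sketch every training example receives one vote per LUT on its true class, which mirrors the guarantee stated above; ties between classes remain possible.
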

Jørgensen et al. further describe what is named a "leave-one-out cross-validation test",
which suggests a method for selecting an optimal number of input connections to use
per LUT in order to obtain a low classification error rate with a short overall computation
time. In order to perform such a cross-validation test it is necessary to obtain a
knowledge of the actual number of training examples that have visited or addressed the
cell or element corresponding to the addressed column and class. It is therefore suggested
that these numbers are stored in the LUTs. It is also suggested by Jørgensen et al.
how the LUTs in the network can be selected in a more optimum way by successively
training new sets of LUTs and performing a cross-validation test on each LUT. Thus, it is
known to have a RAM network in which the LUTs are selected by presenting the training
set to the system several times.

The output vector from the RAM network contains a number of output scores, one for
each possible class. As mentioned above a decision is normally made by classifying an
example into the class having the largest output score. This simple winner-takes-all
(WTA) scheme assures that the true class of a training example cannot lose to one of
the other classes. One problem with the RAM net classification scheme is that it often
behaves poorly when trained on a training set where the distribution of examples between
the training classes is highly skewed. Accordingly there is a need for understanding
the influence of the composition of the training material on the behaviour of
the RAM classification system as well as a general understanding of the influence of
specific parameters of the architecture on the performance. From such an understanding
it could be possible to modify the classification scheme to improve its performance
and competitiveness with other schemes. Such improvements of the RAM based classification
systems are provided according to the present invention.

SUMMARY OF THE INVENTION 

Recently Thomas Martini Jørgensen and Christian Linneberg (inventors of this invention)
have provided a statistical framework that has made it possible to make a theoretical
analysis that relates the expected output scores of the n-tuple net to the stochastic
parameters of the example distributions, the number of available training examples,
and the number of address lines n used for each LUT or n-tuple. From the obtained
expressions, they have been able to study the behaviour of the architecture in different
scenarios. Furthermore, based on the theoretical results, they have come up with proposals
for modifying the n-tuple classification scheme in order to make it operate as a
close approximation to the maximum a posteriori or maximum likelihood estimator.
The resulting modified decision criteria can for example deal with the so-called skewed
class prior problem, which causes the n-tuple net to often behave poorly when trained on a
training set where the distribution of examples between the training classes is highly
skewed. Accordingly the proposed changes of the classification scheme provide an
essential improvement of the architecture. The suggested changes in decision criteria
are not only applicable to the original n-tuple architecture based on random memorisation.
They also apply to extended n-tuple schemes, some of which use a more optimal
selection of the address lines and some of which apply an extended weight scheme.

According to a first aspect of the present invention there is provided a method for training
a computer classification system which can be defined by a network comprising a
number of n-tuples or Look Up Tables (LUTs), with each n-tuple or LUT comprising a
number of rows corresponding to at least a subset of possible classes and further comprising
a number of columns being addressed by signals or elements of sampled training
input data examples, each column being defined by a vector having cells with values,
said method comprising determining the column vector cell values based on one
or more training sets of input data examples for different classes so that at least part of
the cells comprise or point to information based on the number of times the corresponding
cell address is sampled from one or more sets of training input examples. The
method further comprises determining one or more output score functions for evaluation
of at least one output score value per class, and/or determining one or more decision
rules to be used in combination with at least part of the obtained output score values
to determine a winning class.

It is preferred that the output score values are evaluated or determined based on the 
information of at least part of the determined column vector cell values.

According to the present invention it is preferred that the output score functions and/or 
the decision rules are determined based on the information of at least part of the de- 
termined column vector cell values. 

It is also preferred to determine the output score functions from a family of output score
functions determined by a set of parameter values. Thus, the output score functions
may be determined either from the set of parameter values, from the information of at
least part of the determined column vector cell values or from both the set of parameter
values and the information of at least part of the determined column vector cell values.

It should be understood that the training procedure of the present invention may be 
considered a two step training procedure. The first step may comprise determining the 
column vector cell values, while the second step may comprise determining the output
score functions and/or the decision rules. 

As already mentioned, the column vector cells are determined based on one or more 
training sets of input data examples of known classes, but the output score functions 
and/or the decision rules may be determined based on a validation set of input data
examples of known classes. Here the validation set may be equal to or part of the training
set(s), but the validation set may also be a set of examples not included in the training
set(s).

According to the present invention the training and/or validation input data examples
may preferably be presented to the network as input signal vectors. 

It is preferred that determination of the output score functions is performed so as to 
allow different ways of using the contents of the column vector cells in calculating the 
output scores used to find the winning class amongst two or more classes. The way the
contents of the column vector cells are used to obtain the score of one class might de- 
pend on which class(es) it is compared with. 

It is also preferred that the decision rules used when comparing two or more classes in 
the output space are allowed to deviate from the decision rules corresponding to a
WTA decision. Changing the decision rules for choosing between two or more classes is
equivalent to allowing individual transformation of the class output scores and keeping a
WTA comparison. These corresponding transformations might depend on which 
class(es) a given class is compared with. 
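
The equivalence between a modified decision rule and a WTA comparison on transformed scores can be illustrated with a small sketch. The per-class normalisation by training-set size used here is purely an assumed example of such a transformation, chosen to show the effect on skewed classes; it is not presented as the transformation defined by the invention, and all names are hypothetical.

```python
def winner_takes_all(scores):
    """Return the index of the largest output score."""
    return max(range(len(scores)), key=scores.__getitem__)

def decide(scores, transforms):
    """Apply one transformation per class, then keep the WTA comparison."""
    return winner_takes_all([f(s) for f, s in zip(transforms, scores)])

# Hypothetical two-class case: class 0 had 900 training examples, class 1 had 100.
raw_scores = [30, 12]
per_class = [lambda s: s / 900, lambda s: s / 100]  # assumed normalisation

plain_winner = winner_takes_all(raw_scores)       # plain WTA on the raw scores
adjusted_winner = decide(raw_scores, per_class)   # WTA on the transformed scores
```

In this toy case the raw comparison favours the heavily represented class, while the transformed comparison favours the other class.
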

The determination of how the output score functions may be calculated from the column
vector cell values, as well as the determination of how many output score functions
to use and/or the determination of the decision rules to be applied on the output
score values may comprise the initialisation of one or more sets of output score functions
and/or decision rules.

Furthermore it is preferred to adjust at least part of the output score functions and/or the 
decision rules based on an information measure evaluating the performance on the 
validation example set. If the validation set equals the training set or part of the training
set it is preferred to use a leave-one-out cross-validation evaluation or extensions of this 
concept. 

In order to determine or adjust the output score functions and the decision rules according
to the present invention, the column cell values should be determined. Here, it
is preferred that at least part of the column cell values are determined as a function of
the number of times the corresponding cell address is sampled from the set(s) of training
input examples. Alternatively, the information of the column cells may be determined
so that the maximum column cell value is 1, but at least part of the cells have an
associated value being a function of the number of times the corresponding cell address
is sampled from the training set(s) of input examples. Preferably, the column vector
cell values are determined and stored in storing means before the determination or
adjustment of the output score functions and/or the decision rules.

According to the present invention, a preferred way of determining the column vector
cell values may comprise the training steps of 

a) applying a training input data example of a known class to the classifica- 
tion network, thereby addressing one or more column vectors, 

b) incrementing, preferably by one, the value or vote of the cells of the addressed
column vector(s) corresponding to the row(s) of the known class,

and 

c) repeating steps (a)-(b) until all training examples have been applied to the 
network. 
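
Training steps (a)-(c) can be sketched as follows. The sketch assumes that the column addressed in each LUT has already been computed for every training example and stores the cells as dictionaries keyed by (column, class); this data layout and the function name are illustrative assumptions.

```python
from collections import defaultdict

def train_counts(addresses_per_example, labels, num_luts):
    """Count, per LUT cell, how many training examples addressed it.

    addresses_per_example[j][i] is the column addressed in LUT i by
    example j (how addresses are computed is outside this sketch).
    """
    # cells[i][(column, class)] -> number of visits by training examples
    cells = [defaultdict(int) for _ in range(num_luts)]
    for addrs, label in zip(addresses_per_example, labels):  # step (a)
        for i, column in enumerate(addrs):
            cells[i][(column, label)] += 1                   # step (b)
    return cells                                             # step (c): all examples applied
```

Unlike the conventional 0/1 scheme, each cell here keeps the full visit count, which is the information later used by the output score functions and decision rules.
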

However, it should be understood that the present invention also covers embodiments
where the information of the column cells is determined by alternative functions of the
number of times the cell has been addressed by the input training set(s). Thus, the cell
information does not need to comprise a count of all the times the cell has been addressed,
but may for example comprise an indication of when the cell has been visited
zero times, once, more than once, and/or twice and more than twice and so on.

In order to determine the output score functions and/or the decision rules, it is preferred
to adjust these output score functions and/or decision rules, which adjustment
process may comprise one or more iteration steps. The adjustment of the output score
functions and/or the decision rules may comprise the steps of determining a global
quality value based on at least part of the column vector cell values, determining if the
global quality value fulfils a required quality criterion, and adjusting at least part of
the output score functions and/or part of the decision rules until the global quality criterion
is fulfilled.



The adjustment process may also include determination of a local quality value for
each sampled validation input example, with one or more adjustments being performed
if the local quality value does not fulfil a specified or required local quality criterion
for the selected input example. As an example the adjustment of the output score
functions and/or the decision rules may comprise the steps of

a) selecting an input example from the validation set(s),

b) determining a local quality value corresponding to the sampled validation input
example, the local quality value being a function of at least part of the addressed
column cell values,

c) determining if the local quality value fulfils a required local quality criterion, and
adjusting one or more of the output score functions and/or decision rules if
the local quality criterion is not fulfilled,

d) selecting a new input example from a predetermined number of examples of
the validation set(s),

e) repeating the local quality test steps (b)-(d) for all the predetermined validation
input examples,

f) determining a global quality value based on at least part of the column vectors
being addressed during the local quality test,

g) determining if the global quality value fulfils a required global quality criterion,
and, 

h) repeating steps (a)-(g) until the global quality criterion is fulfilled. 
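
The adjustment steps (a)-(h) can be outlined in code as below. This is a sketch under stated assumptions: the callables `local_quality`, `local_ok`, `adjust`, `global_quality` and `global_ok` are hypothetical placeholders for whatever quality functions, criteria and adjustment rule a given embodiment uses, and the iteration cap `max_rounds` is an added safeguard.

```python
def adjust_rules(rules, validation_set, local_quality, local_ok,
                 adjust, global_quality, global_ok, max_rounds=100):
    """Repeat the local and global quality tests until the global criterion holds.

    All callables are hypothetical placeholders supplied by the caller.
    Returns the final rules and the last global quality value.
    """
    q_global = None
    for _ in range(max_rounds):                           # step (h)
        for example in validation_set:                    # steps (a), (d), (e)
            q_local = local_quality(rules, example)       # step (b)
            if not local_ok(q_local):                     # step (c)
                rules = adjust(rules, example)
        q_global = global_quality(rules, validation_set)  # step (f)
        if global_ok(q_global):                           # step (g)
            break
    return rules, q_global
```

As a toy usage, `rules` could be a single decision threshold that `adjust` nudges up or down whenever a validation example is misclassified, with the global quality being the fraction of correctly classified validation examples.
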

Preferably, steps (b)-(d) of the above mentioned adjustment process may be carried out
for all examples of the validation set(s). 

The local and/or global quality value may be defined as functions of at least part of the 
column cells. 


It should be understood that when adjusting the output score functions and/or decision 
rules by use of one or more quality values each with a corresponding quality criterion, 
it may be preferred to stop the adjustment iteration process if a quality criterion is not 
fulfilled after a given number of iterations. 


It should also be understood that during the adjustment process the adjusted output 
score functions and/or decision rules are preferably stored after each adjustment, and 
when the adjustment process includes the determination of a global quality value, the 
step of determination of the global quality value may further be followed by separately 
storing the hereby obtained output score functions and/or decision rules or classification
system configuration values if the determined global quality value is closer to fulfilling
the global quality criterion than the global quality value corresponding to previously 
separately stored output score functions and/or decision rules or configuration values. 
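
The combination of an iteration limit and separate storage of the best configuration obtained so far can be sketched as follows. The sketch assumes that a higher global quality value is closer to fulfilling a criterion of the form `quality >= target`; `adjust_once` and `global_quality` are hypothetical callables.

```python
import copy

def adjust_keep_best(config, adjust_once, global_quality, target, max_iterations=50):
    """Iterate adjustments while separately keeping the best configuration so far.

    Assumes larger quality values are better and the criterion is
    quality >= target; both callables are hypothetical placeholders.
    """
    best_config = copy.deepcopy(config)
    best_quality = global_quality(config)
    for _ in range(max_iterations):          # stop after a given number of iterations
        if best_quality >= target:           # global quality criterion fulfilled
            break
        config = adjust_once(config)         # adjusted configuration after each step
        quality = global_quality(config)
        if quality > best_quality:           # closer to fulfilling the criterion
            best_config, best_quality = copy.deepcopy(config), quality
    return best_config, best_quality
```

Even when the criterion is never met, the function returns the best configuration encountered rather than the last one tried.
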

A main reason for training a classification system according to an embodiment of the
present invention is to obtain a high confidence in a subsequent classification process 
of an input example of an unknown class. 

Thus, according to a further aspect of the present invention, there is also provided a 
method of classifying input data examples into at least one of a plurality of classes using
a computer classification system configured according to any of the above described
methods of the present invention, whereby column cell values for each n-tuple
or LUT and output score functions and/or decision rules are determined using one
or more training or validation sets of input data examples, said method comprising

a) applying an input data example to be classified to the configured classification
network thereby addressing column vectors in the set of n-tuples or LUTs, 

b) selecting a set of classes which are to be compared using a given set of output 
score functions and decision rules thereby addressing specific rows in the set of 
n-tuples or LUTs, 

c) determining output score values as a function of the column vector cells and us- 
ing the determined output score functions, 

d) comparing the calculated output values using the determined decision rules, and 

e) selecting the class or classes that win(s) according to the decision rules. 
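
Classification steps (a)-(e) can be sketched as follows, assuming the column addresses for the input example are already known and that trained cells are stored as dictionaries keyed by (column, class). The parameters `score_fn` and `decide` stand in for an output score function and a decision rule; they, and the data layout, are illustrative assumptions.

```python
def classify(addresses, cells, classes, score_fn, decide):
    """Score the selected classes for one input example and pick the winner.

    addresses[i]   : column addressed in LUT i by the input (step a)
    cells[i]       : dict mapping (column, class) -> trained cell value
    classes        : classes to be compared (step b)
    score_fn(vals) : output score function over addressed cell values (step c)
    decide(scores) : decision rule returning the winning class (steps d, e)
    """
    scores = {}
    for c in classes:
        values = [cells[i].get((col, c), 0) for i, col in enumerate(addresses)]
        scores[c] = score_fn(values)
    return decide(scores), scores

# Example choices: one vote per LUT whose cell was visited at least once,
# combined with a plain winner-takes-all decision.
def votes(values):
    return sum(1 for v in values if v >= 1)

def wta(scores):
    return max(scores, key=scores.get)
```

Swapping `votes` or `wta` for other functions changes the decision criteria without retraining the cell values.
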

The present invention also provides training and classification systems according to the 
above described methods of training and classification. 



Thus, according to the present invention there is provided a system for training a computer
classification system which can be defined by a network comprising a stored
number of n-tuples or Look Up Tables (LUTs), with each n-tuple or LUT comprising a
number of rows corresponding to at least a subset of possible classes and further comprising
a number of columns being addressed by signals or elements of sampled training
input data examples, each column being defined by a vector having cells with values,
said system comprising

• input means for receiving training input data examples of known classes, 

• means for sampling the received input data examples and addressing column vec- 
tors in the stored set of n-tuples or LUTs, 

• means for addressing specific rows in the set of n-tuples or LUTs, said rows corresponding
to a known class,

• storage means for storing determined n-tuples or LUTs, 

• means for determining column vector cell values so as to comprise or point to in- 
formation based on the number of times the corresponding cell address is sampled 
from the training set(s) of input examples, and 

• means for determining one or more output score functions and/or one or more decision
rules.

Here, it is preferred that the means for determining the output score functions and/or
decision rules is adapted to determine these functions and/or rules based on the infor- 
mation of at least part of the determined column vector cell values. 

The means for determining the output score functions may be adapted to determine
such functions from a family of output score functions determined by a set of parameter
values. Thus, the means for determining the output score functions may be adapted to
determine such functions either from the set of parameter values, from the information
of at least part of the determined column vector cell values or from both the set of parameter
values and the information of at least part of the determined column vector cell
values.



According to the present invention the means for determining the output score func- 
tions and/or the decision rules may be adapted to determine such functions and/or 
rules based on a validation set of input data examples of known classes. Here the validation
set may be equal to or part of the training set(s) used for determining the column
cell values, but the validation set may also be a set of examples not included in the 
training set(s). 

In order to determine the output score functions and decision rules according to a preferred
embodiment of the present invention, the means for determining the output
score functions and decision rules may comprise

means for initialising one or more sets of output score functions and/or decision
rules, and

means for adjusting output score functions and decision rules by use of at least
part of the validation set of input examples.

As already discussed above the column cell values should be determined in order to 
determine the output score functions and decision rules. Here, it is preferred that the 
means for determining the column vector cell values is adapted to determine these
values as a function of the number of times the corresponding cell address is sampled
from the set(s) of training input examples. Alternatively, the means for determining the
column vector cell values may be adapted to determine these cell values so that the
maximum value is 1, but at least part of the cells have an associated value being a
function of the number of times the corresponding cell address is sampled from the
training set(s) of input examples.



According to an embodiment of the present invention it is preferred that when a train- 
ing input data example belonging to a known class is applied to the classification net- 
work thereby addressing one or more column vectors, the means for determining the 
column vector cell values is adapted to increment the value or vote of the cells of the 
addressed column vector(s) corresponding to the row(s) of the known class, said value 
preferably being incremented by one. 



For the adjustment process of the output score functions and decision rules it is pre- 
ferred that the means for adjusting output score functions and/or decision rules is 
adapted to 

determine a global quality value based on at least part of the column vector cell
values,

determine if the global quality value fulfils a required global quality criterion, 
and 

adjust at least part of the output score functions and/or decision rules until the 
global quality criterion is fulfilled. 


As an example of a preferred embodiment according to the present invention, the 
means for adjusting output score functions and decision rules may be adapted to 

a) determine a local quality value corresponding to a sampled validation input
example, the local quality value being a function of at least part of the addressed
vector cell values,

b) determine if the local quality value fulfils a required local quality criterion, 

c) adjust one or more of the output score functions and/or decision rules if the 
local quality criterion is not fulfilled, 

d) repeat the local quality test for a predetermined number of training input examples,

e) determine a global quality value based on at least part of the column vectors 
being addressed during the local quality test, 

f) determine if the global quality value fulfils a required global quality criterion, 
and, 

g) repeat the local and the global quality test until the global quality criterion is
fulfilled. 

The means for adjusting the output score functions and decision rules may further be
adapted to stop the iteration process if the global quality criterion is not fulfilled after a
given number of iterations. In a preferred embodiment, the means for storing n-tuples
or LUTs comprises means for storing adjusted output score functions and decision rules
and separate means for storing best so far output score functions and decision rules or
best so far classification system configuration values. Here, the means for adjusting the
output score functions and decision rules may further be adapted to replace previously
separately stored best so far output score functions and decision rules with obtained
adjusted output score functions and decision rules if the determined global quality
value is closer to fulfilling the global quality criterion than the global quality value
corresponding to previously separately stored best so far output score functions and decision
rules. Thus, even if the system should not be able to fulfil the global quality criterion
within a given number of iterations, the system may always comprise the "best so far"
system configuration.

According to a further aspect of the present invention there is also provided a system 
for classifying input data examples of unknown classes into at least one of a plurality of
classes, said system comprising: 

storage means for storing a number or set of n-tuples or Look Up Tables (LUTs)
with each n-tuple or LUT comprising a number of rows corresponding to at
least a subset of the number of possible classes and further comprising a number
of column vectors, each column vector being addressed by signals or elements
of a sampled input data example, and each column vector having cell
values being determined during a training process based on one or more sets of
training input data examples,

storage means for storing one or more output score functions and/or one or
more decision rules, each output score function and/or decision rule being determined
during a training or validation process based on one or more sets of
validation input data examples, said system further comprising:

input means for receiving an input data example to be classified,

means for sampling the received input data example and addressing column
vectors in the stored set of n-tuples or LUTs, 

means for addressing specific rows in the set of n-tuples or LUTs, said rows cor- 
responding to a specific class, 

means for determining output score values using the stored output score func- 
tions and at least part of the stored column vector values, and 
means for determining a winning class or classes based on the output score val- 
ues and stored decision rules. 

It should be understood that it is preferred that the cell values of the column vectors 
and the output score functions and/or decision rules of the classification system accord- 
ing to the present invention are determined by use of a training system according to 
any of the above described systems. Accordingly, the column vector cell values and the 
output score functions and/or decision rules may be determined during a training proc- 
ess according to any of the above described methods. 

BRIEF DESCRIPTION OF THE DRAWINGS 

For a better understanding of the present invention and in order to show how the same 
may be carried into effect, reference will now be made by way of example to the ac- 
companying drawings in which: 

Fig. 1 shows a block diagram of a RAM classification network with Look Up Tables 
(LUTs), 

Fig. 2 shows a detailed block diagram of a single Look Up Table (LUT) according to an 
embodiment of the present invention, 

Fig. 3 shows a block diagram of a computer classification system according to the pres- 
ent invention, 

Fig. 4 shows a flow chart of a learning process for LUT column cells according to an 
embodiment of the present invention, 






WO 99/67694 




PCT/DK99/00340 




Fig. 5 shows a flow chart of a learning process according to an embodiment of the pres-
ent invention, 

Fig. 6 shows a flow chart of a classification process according to the present invention. 


DETAILED DESCRIPTION OF THE INVENTION 

In the following a more detailed description of the architecture and concept of a classi- 
fication system according to the present invention will be given including an example 
of a training process of the column cells of the architecture and an example of a classification
process. Furthermore, different examples of learning processes for the output
score functions and the decision rules according to embodiments of the present inven- 
tion are described. 

Notation 


The notation used in the following description and examples is as follows: 



X: The training set.

x: An example from the training set.

N_X: Number of examples in the training set X.

x_j: The j'th example from a given ordering of the training set X.

y: A specific example (possibly outside the training set).

C: Class label.

C(x): Class label corresponding to example x (the true class).

C_W: Winner class obtained by classification.

C_T: True class obtained by classification.

N_C: Number of training classes corresponding to the maximum number of
rows in a LUT.

Ω: Set of LUTs (each LUT may contain only a subset of all possible address
columns, and the different columns may register only subsets of the
existing classes).

N_LUT: Number of LUTs.

N_COL: Number of different columns that can be addressed in a specific LUT
(LUT dependent).

X_C: The set of training examples labelled class C.

v_iC: Entry counter for the cell addressed by the i'th column and the C'th class.

a_i(y): Index of the column in the i'th LUT being addressed by example y.

v: Vector containing all v_iC elements of the LUT network.

Q_L: Local quality function.

Q_G: Global quality function.

B_{Ci,Cj}: Decision rule matrix.

M_{Ci,Cj}: Cost matrix.

S: Score function.

Γ: Leave-one-out cross-validation score function.

P: Path matrix.

β: Parameter vector.

Ξ: Set of decision rules.

d_c: Score value on class c.

D(·): Decision function.

Description of architecture and concept 



In the following references are made to Fig. 1, which shows a block diagram of a RAM 
classification network with Look Up Tables (LUTs), and Fig. 2, which shows a detailed 
block diagram of a single Look Up Table (LUT) according to an embodiment of the 
present invention. 

A RAM-net or LUT-net consists of a number of Look Up Tables (LUTs) (1.3). Let the number of LUTs be denoted $N_{LUT}$. An example of an input data vector $\bar{y}$ to be classified may be presented to an input module (1.1) of the LUT network. Each LUT may sample a part of the input data, where different numbers of input signals may be sampled for different LUTs (1.2) (in principle it is also possible to have one LUT sampling the whole input space). The outputs of the LUTs may be fed (1.4) to an output module (1.5) of the RAM classification network.

In Fig. 2 it is shown that for each LUT the sampled input data (2.1) of the example presented to the LUT-net may be fed into an address selecting module (2.2). The address selecting module (2.2) may from the input data calculate the address of one or more specific columns (2.3) in the LUT. As an example, let the index of the column in the i'th LUT being addressed by an input example $\bar{y}$ be calculated as $a_i(\bar{y})$. The number of addressable columns in a specific LUT may be denoted $N_{COL}$, and varies in general from one LUT to another. The information stored in a specific row of a LUT may correspond to a specific class $C$ (2.4). The maximum number of rows may then correspond to the number of classes, $N_C$. The number of cells within a column corresponds to the number of rows within the LUT. The column vector cells may correspond to class specific entry counters of the column in question. The entry counter value for the cell addressed by the i'th column and class $C$ is denoted $v_{iC}$ (2.5).

The $v_{iC}$-values of the activated LUT columns (2.6) may be fed (1.4) to the output module (1.5), where one or more output scores may be calculated for each class and where these output scores in combination with a number of decision rules determine the winning class.

Let $\bar{x} \in X$ denote an input data example used for training and let $\bar{y}$ denote an input data example not belonging to the training set. Let $C(\bar{x})$ denote the class to which $\bar{x}$ belongs. The class assignment given to the example $\bar{y}$ is then obtained by calculating one or more output scores for each class. The output scores obtained for class $C$ are calculated as functions of the $v_{iC}$ numbers addressed by the example $\bar{y}$ but will in general also depend on a number of parameters $\bar{\beta}$. Let the m'th output score of class $C$ be denoted $S_{C,m}(v_{iC}, \bar{\beta})$. A classification is obtained by combining the obtained output scores from all classes with a number of decision rules. The effect of the decision rules is to define regions in the output score space that must be addressed by the output score values to obtain a given winner class. The set of decision rules is denoted $\Xi$ and corresponds to a set of decision borders.

Figure 3 shows an example of a block diagram of a computer classification system according to the present invention. Here a source such as a video camera or a database provides an input data signal or signals (3.0) describing the example to be classified. These data are fed to a pre-processing module (3.1) of a type which can extract features, reduce, and transform the input data in a predetermined manner. An example of such a pre-processing module is an FFT board (Fast Fourier Transform). The transformed data are then fed to a classification unit (3.2) comprising a RAM network according to the present invention. The classification unit (3.2) outputs a ranked classification list which might have associated confidences. The classification unit can be implemented by using software to programme a standard Personal Computer or by programming a hardware device, e.g. using programmable gate arrays combined with RAM circuits and a digital signal processor. These data can be interpreted in a post-processing device (3.3), which could be a computer module combining the obtained classifications with other relevant information. Finally the result of this interpretation is fed to an output device (3.4) such as an actuator.

Initial training of the architecture 



The flow chart of Fig. 4 illustrates a one pass learning scheme or process for the determination of the column vector entry counter or cell distribution, $v_{iC}$-distribution (4.0), according to an embodiment of the present invention, which may be described as follows:

1. Initialise all entry counters or column vector cells by setting the cell values, $\bar{v}$, to zero (4.1).
2. Present the first training input example, $\bar{x}$, from the training set $X$ to the network (4.2, 4.3).
3. Calculate the columns addressed for the first LUT (4.4, 4.5).
4. Add 1 to the entry counters in the rows of the addressed columns that correspond to the class label of $\bar{x}$ (increment $v_{a_i(\bar{x}),C(\bar{x})}$ in all LUTs) (4.6).
5. Repeat step 4 for the remaining LUTs (4.7, 4.8).
6. Repeat steps 3-5 for the remaining training input examples (4.9, 4.10). The number of training examples is denoted $N_X$.
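The one pass scheme above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: the helper names `column_address` and `train_counters`, the use of fixed bit positions as the sampling of each LUT, and the sparse dictionary storage of the counters are all hypothetical choices made for the example.

```python
from collections import defaultdict

def column_address(example, bit_positions):
    """Hypothetical address function a_i: pack the sampled bits into an integer."""
    address = 0
    for pos in bit_positions:
        address = (address << 1) | example[pos]
    return address

def train_counters(training_set, labels, luts):
    """One pass training: counters[i][(address, cls)] holds the number of
    training examples of class `cls` that addressed column `address` of LUT i."""
    counters = [defaultdict(int) for _ in luts]        # step 1: all cells are zero
    for example, cls in zip(training_set, labels):     # steps 2 and 6
        for i, bit_positions in enumerate(luts):       # steps 3 and 5
            address = column_address(example, bit_positions)
            counters[i][(address, cls)] += 1           # step 4
    return counters

# Toy data (assumed): two LUTs, each sampling two bits of a 4-bit input.
luts = [(0, 1), (2, 3)]
X = [(1, 0, 1, 1), (1, 0, 0, 0), (0, 1, 1, 1)]
C = [1, 1, 2]
v = train_counters(X, C, luts)
```

Storing only the cells that were actually addressed keeps memory proportional to the training set rather than to the number of addressable columns, which is one possible design choice among several.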
Initialisation of output score functions and decision rules

Before the trained network can be used for classification the output score functions and 
the decision rules must be initialised. 

Classification of an unknown input example

When the RAM network of the present invention has been trained to thereby determine 
values for the column cells whereby the LUTs may be defined, the network may be 
used for classifying an unknown input data example. 

In a preferred example according to the present invention, the classification is performed by using the decision rules $\Xi$ and the output scores obtained from the output score functions. Let the decision function invoking $\Xi$ and the output scores be denoted $D(\cdot)$. The winning class can then be written as:

$$\text{Winner Class} = D(\Xi, S_{1,1}, S_{1,2}, \dots, S_{1,j}, \dots, S_{2,1}, S_{2,2}, \dots, S_{2,k}, \dots, S_{N_C,m})$$

Figure 6 shows a block diagram of the operation of a computer classification system in which a classification process (6.0) is performed. The system acquires one or more input signals (6.1) using e.g. an optical sensor system. The obtained input data are pre-processed (6.2) in a pre-processing module, e.g. a low-pass filter, and presented to a classification module (6.3) which according to an embodiment of the invention may be a LUT-network. The output data from the classification module are then post-processed in a post-processing module (6.4), e.g. a CRC algorithm calculating a cyclic redundancy check sum, and the result is forwarded to an output device (6.5), which could be a monitor screen.

Adjustment of output score function parameter $\bar{\beta}$ and adjustment of decision rules $\Xi$

Usually the initially determined values of $\bar{\beta}$ and the initial set of rules $\Xi$ will not represent the optimal choices. Thus, according to a preferred embodiment of the present invention, an optimisation or adjustment of the $\bar{\beta}$ values and the $\Xi$ rules should be performed.

In order to select or adjust the parameters $\bar{\beta}$ and the rules $\Xi$ to improve the performance of the classification system, it is suggested according to an embodiment of the invention to define proper quality functions for measuring the performance of the $\bar{\beta}$-values and the $\Xi$-rules. Thus, a local quality function $Q_L(\bar{v}, \bar{x}, X, \bar{\beta}, \Xi)$ may be defined, where $\bar{v}$ denotes a vector containing all $v_{iC}$ elements of the LUT network. The local quality function may give a confidence measure of the output classification of a specific example $\bar{x}$. If the quality value does not satisfy a given criterion, the $\bar{\beta}$ values and the $\Xi$ rules are adjusted to make the quality value satisfy, or come closer to satisfying, the criterion (if possible).

Furthermore, a global quality function $Q_G(\bar{v}, X, \bar{\beta}, \Xi)$ may be defined. The global quality function may measure the performance of the input training set as a whole.

Fig. 5 shows a flow chart for adjustment or learning of the $\bar{\beta}$ values and the $\Xi$ rules according to the present invention.



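Example 1

This example illustrates an optimisation procedure for adjusting the decision rules $\Xi$. We consider $N_C$ training classes. The class label $c$ is an integer running from 1 to $N_C$. For each class $c$ we define a single output score function:

[equation not legible in the source]

where $\delta_{i,j}$ is Kronecker's delta ($\delta_{i,j} = 1$ if $i = j$ and 0 otherwise), and

$$\Theta_n(z) = \begin{cases} 1 & \text{if } z \ge n \\ 0 & \text{if } z < n \end{cases}$$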




SUBSTITUTE SHEET (RULE 26) 



WO 99/67694 



PCT/DK99/00340 






The expression for the output score function illustrates a possible family of functions determined by a parameter vector $\bar{\beta}$. This example, however, will only illustrate a procedure for adjusting the decision rules $\Xi$, and not $\bar{\beta}$. For simplicity of notation we therefore initialise all values in $\bar{\beta}$ to one. We then have:

$$S_c(v_{a_i(\bar{x}),c}, \bar{\beta}) = \sum_{i \in \Omega} \Theta_1(v_{a_i(\bar{x}),c}).$$

With this choice of $\bar{\beta}$ the possible output values for $S_c$ are the integers from 0 to $N_{LUT}$ (both inclusive).

The leave-one-out cross-validation score or vote-count on a given class $c$ is:

$$\Gamma_c(\bar{x}) = \sum_{i \in \Omega} \Theta_{1+\delta_{C_T(\bar{x}),c}}(v_{a_i(\bar{x}),c}),$$

where $C_T(\bar{x})$ denotes the true class of example $\bar{x}$.

For all possible inter-class combinations $(c_1, c_2)$, with $c_1 \in \{1, 2, \dots, N_C\}$, $c_2 \in \{1, 2, \dots, N_C\}$ and $c_1 \ne c_2$, we wish to determine a suitable decision border in the score space spanned by the two classes. The matrix $B^{c_1,c_2}$ is defined to contain the decisions corresponding to a given set of decision rules applied to the two corresponding output score values, i.e. whether class $c_1$ or class $c_2$ wins. The row and column dimensions are given by the allowed ranges of the two output score values, i.e. the matrix dimension is $(N_{LUT}+1) \times (N_{LUT}+1)$. Accordingly, the row and column indexes run from 0 to $N_{LUT}$.

Each matrix element contains one of the following three values: $c_1$, $c_2$ and $k_{AMB}$, where $k_{AMB}$ is a constant different from $c_1$ and $c_2$. Here we use $k_{AMB} = 0$. The two output score values $S_1$ and $S_2$ obtained for class $c_1$ and class $c_2$, respectively, are used to address the element $b^{c_1,c_2}_{S_1,S_2}$ in the matrix $B^{c_1,c_2}$. If the addressed element contains the value $c_1$ it means that class $c_1$ wins over class $c_2$. If the addressed element contains the value $c_2$ it means that class $c_2$ wins over class $c_1$. Finally, if the addressed element contains the value $k_{AMB}$, it means the decision is ambiguous.

The decision rules are initialised to correspond to a WTA decision. This corresponds to having a decision border along the diagonal in the matrix $B^{c_1,c_2}$. Along the diagonal the elements are initialised to take on the value $k_{AMB}$. Above and respectively below the diagonal the elements are labelled with opposite class values.
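The two vote counts used to address the decision matrix can be sketched in Python as follows. This is an illustrative sketch only: it assumes that $\Theta_n(z)$ is the threshold function that equals one when $z \ge n$ and zero otherwise, and the function names, the counter layout (one dictionary per LUT keyed by column address and class) and the toy numbers are hypothetical.

```python
def theta(n, z):
    """Threshold function assumed in this sketch: 1 if z >= n, else 0."""
    return 1 if z >= n else 0

def score(counters, addresses, cls):
    """S_c with all thresholds set to one: the number of LUTs whose
    addressed cell for class `cls` is non-empty."""
    return sum(theta(1, v.get((a, cls), 0)) for v, a in zip(counters, addresses))

def loo_score(counters, addresses, cls, true_cls):
    """Leave-one-out count: for the true class the cell must hold at least 2,
    which discounts the vote the example itself contributed during training."""
    n = 2 if cls == true_cls else 1
    return sum(theta(n, v.get((a, cls), 0)) for v, a in zip(counters, addresses))

# Toy counters for two LUTs (assumed values), keyed by (column address, class).
counters = [{(2, 1): 2, (1, 2): 1}, {(3, 1): 1, (0, 1): 1, (3, 2): 1}]
addresses = [2, 3]          # columns addressed by one class-1 training example
s1 = score(counters, addresses, 1)
g1 = loo_score(counters, addresses, 1, true_cls=1)
g2 = loo_score(counters, addresses, 2, true_cls=1)
```

In this toy case the ordinary score for class 1 is 2, while the leave-one-out score drops to 1 because one of the two addressed cells was populated only by the example being tested.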

A strategy for adjusting the initialised decision border according to an information measure that uses the $v_{a_i(\bar{x}),c}$ values is outlined below.

Create the cost matrix $M^{c_1,c_2}$ with elements given as:

[equation not legible in the source]

Here $\alpha_{c_1,c_2}$ denotes the cost associated with classifying an example from class $c_1$ into class $c_2$, and $\alpha_{c_2,c_1}$ denotes the cost associated with the opposite error. It is here assumed that a logical true evaluates to one and a logical false evaluates to zero.

A minimal-cost path from $m_{0,0}$ to $m_{N_{LUT},N_{LUT}}$ can be calculated using e.g. a dynamic programming approach as shown by the following pseudo-code (the code uses a path matrix $P^{c_1,c_2}$ with the same dimensions as $B^{c_1,c_2}$):

    // Loop through all entries in the cost matrix in reverse order:
    for i := Nlut to 0 step -1
    {
        for j := Nlut to 0 step -1
        {
            // The end point (Nlut, Nlut) has no successor and is left unchanged
            if ((i <> Nlut) or (j <> Nlut))
            {
                // For each entry, calculate the lowest
                // associated total-cost given as
                m[i,j] := m[i,j] + min(m[i+1,j], m[i+1,j+1], m[i,j+1]);
                // (Indexes outside the matrix are considered
                // as addressing the value of infinity)
                if (min(m[i+1,j], m[i+1,j+1], m[i,j+1]) == m[i+1,j])   p[i,j] := 1;
                if (min(m[i+1,j], m[i+1,j+1], m[i,j+1]) == m[i+1,j+1]) p[i,j] := 2;
                if (min(m[i+1,j], m[i+1,j+1], m[i,j+1]) == m[i,j+1])   p[i,j] := 3;
            }
        }
    }

    // According to the dynamic programming approach the path
    // with the smallest associated total-cost is now obtained
    // by traversing the P-matrix in the following manner to obtain
    // the decision border in the score space spanned by the
    // classes in question.
    i := 0;
    j := 0;
    repeat
    {
        // The border element itself is ambiguous (k_AMB = 0)
        b[i,j] := 0;
        for a := i + 1 to Nlut step 1
        {
            // Larger score for class c1 than on the border: c1 wins
            b[a,j] := c1;
        }
        for a := j + 1 to Nlut step 1
        {
            // Larger score for class c2 than on the border: c2 wins
            b[i,a] := c2;
        }
        iold := i;
        jold := j;
        if (p[iold,jold] < 3) then i := iold + 1;
        if (p[iold,jold] > 1) then j := jold + 1;
    } until (i == Nlut and j == Nlut);
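The pseudo-code above can be sketched as runnable Python as follows. This is a hedged illustration, not the definitive procedure: the function names and the toy cost matrix are made up for the example, and the labelling of the cells on either side of the border (class $c_1$ for larger row index, class $c_2$ for larger column index) is an assumption consistent with rows being addressed by the score of $c_1$.

```python
import math

def min_cost_border(cost):
    """Return (total_cost, path) of a minimal-cost path from (0, 0) to (n, n)
    that moves down, diagonally or right through the square matrix `cost`."""
    n = len(cost) - 1
    m = [row[:] for row in cost]                  # accumulated total-costs
    p = [[0] * (n + 1) for _ in range(n + 1)]     # 1: down, 2: diagonal, 3: right

    def at(i, j):
        # Indexes outside the matrix address the value of infinity.
        return m[i][j] if i <= n and j <= n else math.inf

    for i in range(n, -1, -1):
        for j in range(n, -1, -1):
            if i == n and j == n:
                continue                          # end point has no successor
            down, diag, right = at(i + 1, j), at(i + 1, j + 1), at(i, j + 1)
            best = min(down, diag, right)
            m[i][j] += best
            # Later tests override earlier ones, as in the pseudo-code.
            if best == down:
                p[i][j] = 1
            if best == diag:
                p[i][j] = 2
            if best == right:
                p[i][j] = 3

    path, i, j = [(0, 0)], 0, 0
    while (i, j) != (n, n):
        step = p[i][j]
        if step < 3:
            i += 1
        if step > 1:
            j += 1
        path.append((i, j))
    return m[0][0], path

def border_matrix(path, n, c1, c2, k_amb=0):
    """Build a decision matrix from the border path (assumed labelling)."""
    b = [[k_amb] * (n + 1) for _ in range(n + 1)]
    for i, j in path:
        b[i][j] = k_amb                           # border cells are ambiguous
        for a in range(i + 1, n + 1):
            b[a][j] = c1                          # higher score for c1
        for a in range(j + 1, n + 1):
            b[i][a] = c2                          # higher score for c2
    return b

cost = [[0, 4, 9],
        [4, 1, 9],
        [9, 1, 0]]                                # toy cost matrix (assumed)
total, path = min_cost_border(cost)
B = border_matrix(path, 2, c1=1, c2=2)
```

For this toy matrix the cheapest path is the diagonal, so the resulting decision matrix coincides with the WTA initialisation; a cost matrix with cheaper off-diagonal cells would bend the border away from the diagonal.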



The dynamic programming approach can be extended with regularisation terms, which constrain the shape of the border.

An alternative method for determining the decision border could be to fit a B-spline 
with two control points in such a way that the associated cost is minimised. 

Using the decision borders determined from the strategy outlined above an example can now be classified in the following manner:

• Present the example to the network in order to obtain the score values or vote numbers $S_c(\bar{x}) = \sum_{i \in \Omega} \Theta_1(v_{a_i(\bar{x}),c})$.

• Define a new set of score values $d_c$ for all classes and initialise the scores to zero: $d_c = 0$, $1 \le c \le N_C$.

• Loop through all possible inter-class combinations, $(c_1, c_2)$, and update the vote-values: $d_{b^{c_1,c_2}_{S_{c_1},S_{c_2}}} := d_{b^{c_1,c_2}_{S_{c_1},S_{c_2}}} + 1$.

• The example is now classified as belonging to the class with the label found from $\arg\max_c (d_c)$.
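The voting steps listed above can be sketched in Python as follows. The sketch makes several assumptions that are not in the text: it visits each unordered class pair once, it lets ambiguous matrix elements cast no vote, it breaks ties in favour of the lowest class label, and it uses WTA matrices in place of adjusted decision borders. The names `wta_matrix` and `classify` are hypothetical.

```python
from itertools import combinations

def wta_matrix(n_lut, c1, c2, k_amb=0):
    """Winner-takes-all decision matrix: rows are indexed by the score of c1,
    columns by the score of c2, and the diagonal is ambiguous."""
    return [[c1 if i > j else c2 if j > i else k_amb
             for j in range(n_lut + 1)] for i in range(n_lut + 1)]

def classify(scores, matrices, k_amb=0):
    """scores: class -> vote number; matrices: (c1, c2) -> decision matrix."""
    votes = {c: 0 for c in scores}                      # d_c := 0
    for c1, c2 in combinations(sorted(scores), 2):      # inter-class pairs
        winner = matrices[(c1, c2)][scores[c1]][scores[c2]]
        if winner != k_amb:                             # ambiguous: no vote here
            votes[winner] += 1
    return max(sorted(votes), key=votes.get), votes     # argmax over d_c

n_lut = 2
matrices = {pair: wta_matrix(n_lut, *pair)
            for pair in combinations([1, 2, 3], 2)}
winner, votes = classify({1: 2, 2: 1, 3: 0}, matrices)
```

Replacing the entries of `matrices` with borders obtained from the dynamic programming step changes the decisions without changing this voting loop.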

A leave-one-out cross-validation test using the decision borders determined from the strategy outlined above is obtained in the following manner:

• Present the example to the network in order to obtain the leave-one-out score values or vote numbers $\Gamma_c(\bar{x}) = \sum_{i \in \Omega} \Theta_{1+\delta_{C_T(\bar{x}),c}}(v_{a_i(\bar{x}),c})$.

• Define a new set of score values $d_c$ for all classes and initialise the scores to zero: $d_c = 0$, $1 \le c \le N_C$.

• Loop through all possible inter-class combinations, $(c_1, c_2)$, and update the vote-values: $d_{b^{c_1,c_2}_{\Gamma_{c_1},\Gamma_{c_2}}} := d_{b^{c_1,c_2}_{\Gamma_{c_1},\Gamma_{c_2}}} + 1$.

• The example is now classified as belonging to the class with the label found from $\arg\max_c (d_c)$.

With reference to Figure 5 the above adjustment procedure for the decision rules (borders) $\Xi$ may be described as follows:

• Initialise the system by setting all values of $\bar{\beta}$ to one, selecting a WTA scheme on a two by two basis and by training the n-tuple classifier according to the flow chart in Fig. 4. (5.0)

• Batch mode optimisation is chosen. (5.1)

• Test all examples by performing a leave-one-out classification as outlined above (5.12) and calculate the obtained leave-one-out cross-validation error rate and use it as the $Q_G$-measure. (5.13)

• Store the values of $\bar{\beta}$ and the corresponding $Q_G$-value as well as the $\Xi$-rules (the $B^{c_1,c_2}$ matrices). (5.14)

• If the $Q_G$-value does not satisfy a given criterion and no other stop criterion is met then adjust the $\Xi$-rules according to the dynamic programming approach outlined above. (5.16, 5.15)

• If the $Q_G$-value satisfies the criterion or another stop criterion is met then select the combination with the lowest total error-rate. (5.17)

In the above case one would as an alternative stop criterion use a criterion that only allows two loops through the adjustment scheme.



Example 2 

This example illustrates an optimisation procedure for adjusting $\bar{\beta}$.
For each class we again define a single output score

$$S_c(v_{a_i(\bar{x}),c}, \bar{\beta}) = \sum_{i \in \Omega} \Theta_{k_c}(v_{a_i(\bar{x}),c}).$$

With these score values the example is now classified as belonging to the class with the label found from $\arg\max_c (S_c)$.

In this example we use $\bar{\beta} = (k_1, k_2, \dots, k_{N_C})$. We also initialise the $\Xi$ rules to describe a WTA decision when comparing the output scores from the different classes.

• Initialise the system by setting all $k_c$-values to one, selecting a WTA scheme and by training the n-tuple classifier according to the flow chart in Fig. 4. (5.0)

• Batch mode optimisation is chosen. (5.1)

• Test all examples using a leave-one-out cross-validation test (5.12) and calculate the obtained leave-one-out cross-validation error rate used as $Q_G$. (5.13)

• Store the values of $\bar{\beta}$ and the corresponding $Q_G$ value. (5.14)

• Loop through all possible combinations of $k_{c_1}, k_{c_2}, \dots, k_{c_{N_C}}$ where $k_j \in \{1, 2, 3, \dots, k_{MAX}\}$. (5.16, 5.15)

• Select the combination with the lowest total error-rate. (5.17)

For practical use, the $k_{MAX}$-value will depend upon the skewness of the class priors and the number of address-lines used in the RAM net system.
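The exhaustive search of Example 2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the leave-one-out score discounts one count from the true class before thresholding (equivalent to raising the threshold by one for that class), ties are broken in favour of the lowest class label, and the data layout, function names and toy counter values are hypothetical.

```python
from itertools import product

def loo_winner(cell_counts, true_cls, thresholds):
    """cell_counts: class -> list of addressed counter values, one per LUT.
    Returns the WTA winner under leave-one-out scoring."""
    scores = {}
    for cls, counts in cell_counts.items():
        own = 1 if cls == true_cls else 0       # the example's own contribution
        scores[cls] = sum(1 for v in counts if v - own >= thresholds[cls])
    return max(sorted(scores), key=scores.get)

def grid_search(examples, classes, k_max):
    """Try every threshold combination in {1..k_max}^N_C and keep the one
    with the lowest leave-one-out error rate (first found on ties)."""
    best = None
    for ks in product(range(1, k_max + 1), repeat=len(classes)):
        thresholds = dict(zip(classes, ks))
        errors = sum(loo_winner(cc, t, thresholds) != t for cc, t in examples)
        rate = errors / len(examples)
        if best is None or rate < best[0]:
            best = (rate, thresholds)
    return best

# Toy training examples (assumed): class 1 is over-represented, so its
# counters are large even in columns addressed by the class-2 example.
examples = [
    ({1: [5, 4, 6], 2: [0, 1, 0]}, 1),
    ({1: [1, 1, 1], 2: [2, 2, 1]}, 2),
]
best_rate, best_thresholds = grid_search(examples, classes=[1, 2], k_max=2)
```

The number of combinations grows as $k_{MAX}^{N_C}$, so a search of this kind is practical only for small $k_{MAX}$ and a moderate number of classes.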

Example 3 

This example also illustrates an optimisation procedure for adjusting $\bar{\beta}$ but with the use of a local quality function $Q_L$.

For each class we now define as many output scores as there are competing classes, i.e. $N_C - 1$ output scores:

$$S_{c_j,c_k}(v_{a_i(\bar{x}),c_j}, \bar{\beta}) = \sum_{i \in \Omega} \Theta_{k_{c_j,c_k}}(v_{a_i(\bar{x}),c_j}), \quad \forall k \ne j.$$

With these score values a decision is made in the following manner:

• Define a new set of score values $d_c$ for all classes and initialise the scores to zero: $d_c = 0$, $1 \le c \le N_C$.

• Loop through all possible inter-class combinations, $(c_1, c_2)$, and update the vote-values: if $S_{c_1,c_2} > S_{c_2,c_1}$ then $d_{c_1} := d_{c_1} + 1$ else $d_{c_2} := d_{c_2} + 1$.

• The example is now classified as belonging to the class with the label found from $\arg\max_c (d_c)$.

In this example we use

$$\bar{\beta} = (k_{c_1,c_2}, k_{c_1,c_3}, \dots, k_{c_1,c_{N_C}}, k_{c_2,c_1}, \dots, k_{c_{N_C},c_{N_C-1}}).$$

We also initialise the $\Xi$ rules to describe a WTA decision when comparing the output scores from the different classes.

• Initialise the system by setting all $k_{c_j,c_k}$-values to say two, selecting a WTA scheme and by training the n-tuple classifier according to the flow chart in Fig. 4. (5.0)

• On line mode as opposed to batch mode optimisation is chosen. (5.1)

• For all examples in the training set (5.2, 5.7, and 5.8) do:

    • Test each example to obtain the winner class $C_W$ in a leave-one-out cross-validation. Let the $Q_L$-measure compare $C_W$ with the true class $C_T$. (5.3, 5.4)

    • If $C_W \ne C_T$ a leave-one-out error is made so the values of $k_{C_W,C_T}$ and $k_{C_T,C_W}$ are adjusted by incrementing $k_{C_W,C_T}$ with a small value, say 0.1, and by decrementing $k_{C_T,C_W}$ with a small value, say 0.05. If the adjustment would bring the values below one, no adjustment is performed. (5.5, 5.6)

• When all examples have been processed the global information measure $Q_G$ (e.g. the leave-one-out error rate) is calculated and the values of $\bar{\beta}$ and $Q_G$ are stored. (5.9, 5.10)

• If neither the $Q_G$ criterion nor another stop criterion is fulfilled, the above loop is repeated. (5.11)

• If $Q_G$ is satisfied or another stop criterion is fulfilled, the best of the stored $Q_G$-values is chosen together with the corresponding parameter values $\bar{\beta}$ and decision rules $\Xi$. (5.17, 5.18)
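One pass of the on line adjustment in Example 3 can be sketched in Python as follows. This is an illustration under assumptions, not the definitive scheme: the leave-one-out score discounts one count from the true class before thresholding, an adjustment is skipped entirely if either new value would fall below one, ties between vote totals go to the lowest class label, and the function names and toy counter values are hypothetical.

```python
def pairwise_winner(cell_counts, true_cls, k):
    """cell_counts: class -> addressed counter values (one per LUT);
    k: (c_j, c_k) -> threshold used when c_j competes against c_k."""
    classes = sorted(cell_counts)
    votes = {c: 0 for c in classes}

    def s(cj, ck):
        own = 1 if cj == true_cls else 0      # leave-one-out discount
        return sum(1 for v in cell_counts[cj] if v - own >= k[(cj, ck)])

    for idx, c1 in enumerate(classes):
        for c2 in classes[idx + 1:]:
            if s(c1, c2) > s(c2, c1):
                votes[c1] += 1
            else:
                votes[c2] += 1
    return max(classes, key=votes.get)

def online_pass(examples, k, up=0.1, down=0.05):
    """Process all examples once, adjusting k after each leave-one-out error.
    Returns the error rate observed during the pass."""
    errors = 0
    for cell_counts, true_cls in examples:
        winner = pairwise_winner(cell_counts, true_cls, k)
        if winner != true_cls:
            errors += 1
            raised = k[(winner, true_cls)] + up
            lowered = k[(true_cls, winner)] - down
            if raised >= 1 and lowered >= 1:  # never move a value below one
                k[(winner, true_cls)] = raised
                k[(true_cls, winner)] = lowered
    return errors / len(examples)

examples = [
    ({1: [5, 5, 5], 2: [0, 0, 0]}, 1),
    ({1: [2, 2, 2], 2: [3, 3, 2]}, 2),
]
k = {(1, 2): 2.0, (2, 1): 2.0}                # all thresholds start at two
first_rate = online_pass(examples, k)
second_rate = online_pass(examples, k)
```

In this toy run the class-2 example is misclassified in the first pass, the two thresholds involved are nudged apart, and the second pass classifies both examples correctly.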

The foregoing description of preferred exemplary embodiments of the invention has 
been presented for the purpose of illustration and description. It is not intended to be 
exhaustive or to limit the invention to the precise form disclosed, and obviously many 
modifications and variations are possible in light of the present invention to those 
skilled in the art. All such modifications which retain the basic underlying principles 
disclosed and claimed herein are within the scope of this invention. 
CLAIMS

1. A method of training a computer classification system which can be defined by a network comprising a number of n-tuples or Look Up Tables (LUTs), with each n-tuple or LUT comprising a number of rows corresponding to at least a subset of possible classes and further comprising a number of columns being addressed by signals or elements of sampled training input data examples, each column being defined by a vector having cells with values, said method comprising

determining the column vector cell values based on one or more training sets of input data examples for different classes so that at least part of the cells comprise or point to information based on the number of times the corresponding cell address is sampled from one or more sets of training input examples, and

determining one or more output score functions for evaluation of at least one output score value per class, and/or

determining one or more decision rules to be used in combination with at least part of the obtained output scores to determine a winning class,

said output score functions and/or decision rules being determined based on the information of at least part of the determined column vector cell values.

2. A method according to claim 1, wherein the output score functions and/or the decision rules are determined based on a validation set of input data examples.

3. A method according to claim 2, wherein the validation set comprises at least part of the training set(s) of input data examples.

4. A method according to any of the claims 1-3, wherein the output score 
functions are determined by a set of parameter values. 

5. A method according to any of the claims 1-4, wherein determination of the output score functions and/or the decision rules is based on an information measure evaluating the performance on the validation example set, said evaluating measure preferably being a leave-one-out cross validation test.

6. A method according to any of the claims 1-5, wherein an output score space is given by the output score variable containing the output score values, and the decision rules define regions in the output score space to be addressed by obtained output score values to obtain a winning class.

7. A method according to any of the claims 1-6, wherein determination of the output score functions and/or the decision rules comprises initialising the output score functions and/or the decision rules.

8. A method according to claim 7, wherein the initialisation of the output score functions comprises determining a number of set-up parameters.

9. A method according to claims 7 or 8, wherein the initialisation of the output score functions comprises setting all output score functions to a pre-determined mapping function.

10. A method according to any of the claims 7-9, wherein the initialisation of 
the decision rules comprises setting the rules to a pre-determined decision scheme. 

11. A method according to any of the claims 1-10, further comprising adjusting the output score functions and/or the decision rules, said adjustment preferably being based on an information measure evaluation.

12. A method according to claim 11, wherein said information measure evaluation is a leave-one-out cross validation test.

13. A method according to claim 8 and any of the claims 11-12, wherein the 
adjustment comprises changing the values of the set-up parameters. 

14. A method according to any of the claims 1-13, wherein the determination of the column vector cell values comprises the training steps of

a) applying a training input data example of a known class to the classification network, thereby addressing one or more column vectors,

b) incrementing, preferably by one, the value or vote of the cells of the addressed column vector(s) corresponding to the row(s) of the known class, and

c) repeating steps (a)-(b) until all training examples have been applied to the network.



15. A method according to any of the claims 11-14, wherein the adjustment 
process comprises the steps of 

determining a global quality value based on at least part of the column vector 
cell values, 

determining if the global quality value fulfils a required quality criterion, and 
adjusting at least part of output score functions and/or part of the decision rules 
until the global quality criterion is fulfilled. 

16. A method according to any of the claims 11-15, wherein the adjustment process comprises the steps of

a) selecting an input example from the validation set(s), 

b) determining a local quality value corresponding to the sampled validation input 
example, the local quality value being a function of at least part of the ad- 
dressed column cell values, 

c) determining if the local quality value fulfils a required local quality criterion, if 
not, 

adjusting one or more of the output score functions and/or decision rules if the 
local quality criterion is not fulfilled, 

d) selecting a new input example from a predetermined number of examples of 
the validation set(s), 

e) repeating the local quality test steps (b)-(d) for all the predetermined validation 
input examples, 

f) determining a global quality value based on at least part of the column vectors 
being addressed during the local quality test, 

g) determining if the global quality value fulfils a required global quality criterion, 
and, 

h) repeating steps (a)-(g) until the global quality criterion is fulfilled.

17. A method according to claim 16, wherein steps (b)-(d) are carried out for all examples of the validation set(s).

18. A method according to any of the claims 15-17, wherein the local and/or global quality value is defined as functions of at least part of the column cells.

19. A method according to any of the claims 15-18, wherein the adjustment iteration process is stopped if the quality criterion is not fulfilled after a given number of iterations.



20. A method of classifying input data examples into at least one of a plurality of classes using a computer classification system configured according to any of the claims 1-19, whereby column cell values for each n-tuple or LUT and output score functions and/or decision rules are determined using one or more training or validation sets of input data examples, said method comprising

a) applying an input data example to be classified to the configured classification network thereby addressing column vectors in the set of n-tuples or LUTs,

b) selecting a set of classes which are to be compared using a given set of output score functions and decision rules thereby addressing specific rows in the set of n-tuples or LUTs,

c) determining output score values as a function of the column vector cells and using the determined output score functions,

d) comparing the calculated output values using the determined decision rules, and

e) selecting the class or classes that win(s) according to the decision rules.

21. A system for training a computer classification system which can be defined by a network comprising a stored number of n-tuples or Look Up Tables (LUTs), with each n-tuple or LUT comprising a number of rows corresponding to at least a subset of possible classes and further comprising a number of columns being addressed by signals or elements of sampled training input data examples, each column being defined by a vector having cells with values, said system comprising

a) input means for receiving training input data examples of known classes,

b) means for sampling the received input data examples and addressing column vectors in the stored set of n-tuples or LUTs,

c) means for addressing specific rows in the set of n-tuples or LUTs, said rows corresponding to a known class,

d) storage means for storing determined n-tuples or LUTs,

e) means for determining column vector cell values so as to comprise or point to information based on the number of times the corresponding cell address is sampled from the training set(s) of input examples, and

f) means for determining one or more output score functions and/or one or more decision rules, said output score functions and/or decision rules determining means being adapted to determine said functions and/or rules based on the information of at least part of the determined column vector cell values.

22. A system according to claim 21, wherein the means for determining the 
output score functions is adapted to determine such functions from a family of output 
score functions determined by a set of parameter values. 


23. A system according to claim 21 or 22, wherein the means for determining the output score functions and/or the decision rules is adapted to determine such functions and/or rules based on a validation set of input data examples of known classes, said validation set preferably comprising at least part of the training set(s) used for determining the column cell values.

24. A system according to any of the claims 21-23, wherein the means for determining the output score functions and decision rules comprises

means for initialising one or more sets of output score functions and/or decision rules, and

means for adjusting output score functions and decision rules by use of at least part of the validation set of input examples.

25. A system according to any of the claims 21-24, wherein the means for determining the column vector cell values is adapted to determine these values as a function of the number of times the corresponding cell address is sampled from the set(s) of training input examples.

26. A system according to any of the claims 21-25, wherein, when a training input data example belonging to a known class is applied to the classification network thereby addressing one or more column vectors, the means for determining the column vector cell values is adapted to increment the value or vote of the cells of the addressed column vector(s) corresponding to the row(s) of the known class, said value preferably being incremented by one.

27. A system according to any of the claims 24-26, wherein the means for adjusting output score functions and/or decision rules is adapted to

determine a global quality value based on at least part of column vector cell values,

determine if the global quality value fulfils a required global quality criterion, and

adjust at least part of the output score functions and/or decision rules until the global quality criterion is fulfilled.

28. A system according to any of the claims 24-27, wherein the means for adjusting output score functions and decision rules is adapted to

a) determine a local quality value corresponding to a sampled validation input example, the local quality value being a function of at least part of the addressed vector cell values,

b) determine if the local quality value fulfils a required local quality criterion,

c) adjust one or more of the output score functions and/or decision rules if the local quality criterion is not fulfilled,

d) repeat the local quality test for a predetermined number of training input examples,

e) determine a global quality value based on at least part of the column vectors being addressed during the local quality test,

f) determine if the global quality value fulfils a required global quality criterion, and,

g) repeat the local and the global quality test until the global quality criterion is fulfilled.
29. A system according to any of the claims 27 or 28, wherein the means for
adjusting the output score functions and decision rules is further adapted to stop the
iteration process if the global quality criterion is not fulfilled after a given number of
iterations.

30. A system according to any of the claims 21-29, wherein the means for
storing n-tuples or LUTs comprises means for storing adjusted output score functions
and decision rules and separate means for storing best so far output score functions and
decision rules or best so far classification system configuration values.

31. A system according to claim 30, wherein the means for adjusting the
output score functions and decision rules is further adapted to replace previously
separately stored best so far output score functions and decision rules with obtained
adjusted output score functions and decision rules if the determined global quality value
is closer to fulfilling the global quality criterion than the global quality value corresponding
to previously separately stored best so far output score functions and decision rules.

32. A system for classifying input data examples of unknown classes into at
least one of a plurality of classes, said system comprising:

storage means for storing a number or set of n-tuples or Look Up Tables (LUTs)
with each n-tuple or LUT comprising a number of rows corresponding to at
least a subset of the number of possible classes and further comprising a number
of column vectors, each column vector being addressed by signals or elements
of a sampled input data example, and each column vector having cell
values being determined during a training process based on one or more sets of
training input data examples,

storage means for storing one or more output score functions and/or one or
more decision rules, each output score function and/or decision rule being determined
during a training or validation process based on one or more sets of
validation input data examples, said system further comprising:

input means for receiving an input data example to be classified,

means for sampling the received input data example and addressing column
vectors in the stored set of n-tuples or LUTs,

means for addressing specific rows in the set of n-tuples or LUTs, said rows
corresponding to a specific class,

means for determining output score values using the stored output score functions
and at least part of the stored column vector values, and

means for determining a winning class or classes based on the output score values
and stored decision rules.

33. A system according to claim 32, wherein the cell values of the column
vectors and the output score functions and/or decision rules of the classification system
are determined by use of a training system according to any of the claims 21-31.

34. A system according to claim 32, wherein the column vector cell values
and the output score functions and/or decision rules are determined during a training
process according to any of the claims 1-19.

Re Item V

Reasoned statement under Article 35(2) with regard to novelty, inventive step or 
industrial applicability; citations and explanations supporting such statement 

1 . Reference is made to the following document: 

D1: T. MARTINI JØRGENSEN: "Classification of handwritten digits using a neural
net architecture", INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, vol. 8, 
no. 1, February 1997 

2. Independent claims 1 and 26 relate to a method and system for training a computer 
classification system e.g. for pattern recognition. 

3. The following features of the application, and particularly of the preambles of claims 
1 and 26, are known from the prior art. 

3.1 Lines 1 to 6 of both claims 1 and 26 recite a (RAM-based neural) network including 
a number of LUTs (Look Up Tables), said LUTs being addressed by (digitized) input 
signals (or parts thereof) and storing data collectively forming a large matrix whose 
columns are addressed by said input signal (or part thereof) and whose rows 
correspond to at least a subset of possible classes (in which the input signals can be 
classified, e.g. for pattern recognition purposes). 

Such a network of LUTs is completely anticipated by D1, cf. figure 1 thereof. As in
the present application, an address selector uses at least some of the (digital) signals
of an input vector to be classified to address one column of the matrix distributively 
stored in the LUTs (cf. D1, page 19, right column, line 2), whilst the matrix rows 
correspond to different object classes (cf. D1, page 18, left column, par. 2). 

3.2 The claims further recite (claim 1, lines 7-10; claim 26, lines 7-16) that during a 
training phase one or more training sets of input signals are fed to the network so that 
each time an input signal (of which the corresponding class is known a priori) selects 
a matrix column, the column cells corresponding to the one or more classes are 
"sampled"; thus, at least part of the matrix cells of a so trained network "comprise" 
or point to information relating to the number of times a given cell has been
"sampled" (i.e. addressed) during training.

Such a training of the network of LUTs is however described in D1 as a "simple one
pass learning scheme", cf. D1, page 19, left column, last two lines, and right column,
lines 1-5. In such a scheme, the general idea is storing the numbers of training inputs
(or examples) visiting a matrix cell, cf. D1, page 20, right column, lines 16-17.

4. The invention is characterised over the prior art by the features recited in the 
characterising portion of claims 1 and 26, according to which the output score 
functions and/or the decision rules are determined based on the information of at 
least part of the determined column vector cells, and that the output score functions 
and/or decision rules are adjusted based on an information measure evaluation. 
The problem is to find suitable score functions and/or decision rules. The present
invention describes a solution to finding such scores and rules. The numbers kept in
the column vector cells are the key information used to determine either
adequate score functions to be used with a given set of decision rules, or
adequate decision rules to be used in combination with a given set of
score functions, or, as a final possibility, adequate decision rules and score
functions at the same time.

As illustrated by the three examples in the description of the present application the 
normal strategy will be to start up with an initial set of score functions and an initial 
choice of decision rules. Using a validation set of examples one can for each of these 
examples read out the addressed column vector values, which are then used to 
evaluate an information measure measuring the performance of the present choice 
of score functions and decision rules. In case the obtained performance does not 
satisfy a given performance constraint one can then adjust the score functions and/or 
the decision rules and evaluate once again the performance. 

If a performance satisfying a given constraint is achieved, the corresponding score 
functions and decision rules are kept. Otherwise, after a given set of iterations, the 
set of score functions and decision rules that gives the best performance will normally 
be chosen. The adjustment simply involves changing the score functions and/or the 
decision rules. 
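
The strategy summarised above (evaluate, adjust, stop when the constraint is met, otherwise keep the best configuration seen) can be sketched as follows. This is only an illustrative sketch: the names `tune_configuration`, `candidates`, `evaluate`, `required_quality` and `max_iterations` are hypothetical and do not appear in the application, and the sketch assumes that a higher quality value is better.

```python
def tune_configuration(candidates, evaluate, required_quality, max_iterations):
    """Try candidate score-function/decision-rule settings in turn.

    `candidates` is any iterable of settings, `evaluate` maps a setting to a
    quality value (assumed here: higher is better). Returns the best setting
    seen and its quality.
    """
    best_setting, best_quality = None, float("-inf")
    for iteration, setting in enumerate(candidates):
        if iteration >= max_iterations:
            break  # give up after a given number of iterations
        quality = evaluate(setting)
        if quality > best_quality:
            # keep the best-so-far configuration separately
            best_setting, best_quality = setting, quality
        if quality >= required_quality:
            break  # performance constraint satisfied
    return best_setting, best_quality
```

If no candidate satisfies the constraint, the function still returns the best configuration encountered, which mirrors the fallback described in the paragraph above.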

5. Independent claims 25 and 40 relate to a method and system for using the thus 
trained computer classification system wherein the plurality of determined and 
possibly adjusted output score functions and decision rules provided as defined in 
claims 1 and 26 are used as opposed to the conventional approach of using one
predetermined output score function and one predetermined decision rule only.

6. Consequently, the subject-matter set out in the present claims, and particularly in 
claims 1 and 26, and in claims 25 and 40 is considered to be novel and non-obvious 
with respect to the disclosures of the available prior art. It is also evident that the 
invention is industrially applicable. 

The requirements of paragraphs (1) to (4) of Article 33 PCT are thus met.

Re Item VII

Certain defects in the international application 

7. The opening part of the description should have been modified to bring it into
agreement with any amended independent claim, Rule 5.1(a)(iii) PCT.

All claims should have included, whenever possible, reference signs relating to the
technical features referred to therein, Rule 6.2(b) PCT.

Contrary to the requirements of Rule 5.1(a)(ii) PCT, the relevant background art 
disclosed in document D1 is not mentioned in the description, nor is this document 
identified therein. 

At line 4 of claim 25, "using on one" should read "using one".

Re Item VIII

Certain observations on the international application 

8. Although the overall invention can be understood from the application and its claims, 
present independent claim 26 is in part worded in a confusing manner, so that the 
requirements of Article 6 PCT are not completely met. 

Independent method claim 25 is not rendered dependent by the reference to claims 
1-24, as it is aimed at a method for using (i.e. "classifying input data examples") the 
network trained as set out in the preceding claims (the same would have been true 
for claim 40 if it contained such a reference). 

The claim does not make sufficiently clear, however, that the "given set of output
score functions and decision rules" recited at lines 8-9 are in fact those
determined and possibly adjusted during training as set out in said preceding claims
1-24.

Claim 25 could have been clarified, for instance, by deleting "configured" at line 2 
(claims 1-24 set out a method for training a computer classification system), shifting 
"according to any of the claims 1-24" after "examples", line 5, and replacing: 
"determined using on one" (line 4) with "determined and possibly adjusted using 
one", as well as "a given set of" (line 8), and "the determined" (lines 12 and 13) with 
"said". 

To some extent, similar amendments could have been made to independent 
apparatus claim 40.

CLAIMS



1. A method of training a computer classification system which can be
defined by a network comprising a number of n-tuples or Look Up Tables (LUTs),
with each n-tuple or LUT comprising a number of rows corresponding to at least a
subset of possible classes and further comprising a number of columns being ad-
dressed by signals or elements of sampled training input data examples, each col-
umn being defined by a vector having cells with values, wherein

the column vector cell values are determined based on one or more
training sets of input data examples for different classes so that at least part of the
cells comprise or point to information based on the number of times the corre-
sponding cell address is sampled from one or more sets of training input examples,
said method being characterised in that

one or more output score functions are determined for evaluation of at least
one output score value per class, and

one or more decision rules are determined to be used in combination with at
least part of the obtained output scores to determine a winning class, wherein said
determination of the output score functions and decision rules comprises

determining output score functions based on the information of at least part
of the determined column vector cell values, and adjusting at least part of the output
score functions based on an information measure evaluation, and/or

determining decision rules based on the information of at least part of the
determined column vector cell values, and adjusting at least part of the decision
rules based on an information measure evaluation.

2. A method according to claim 1, wherein the output score functions are
determined based on a validation set of input data examples.

3. A method according to claim 1 or 2, wherein the decision rules are
determined based on a validation set of input data examples.

4. A method according to any of the claims 1-3, wherein determination of
the output score functions is based on an information measure evaluating the
performance on the validation example set.

5. A method according to any of the claims 1-4, wherein determination of
the decision rules is based on an information measure evaluating the performance
on the validation example set.

6. A method according to any of the claims 3-5, wherein the validation
example set equals at least part of the training set and the information measure is
based on a leave-one-out cross validation evaluation.
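
One way a leave-one-out evaluation might be computed on the trained cell values is sketched below. It rests on assumptions that are not stated in this claim: that each training example incremented its own class cell by one, so its contribution can be removed by subtracting one from that cell; that the score counts LUTs whose corrected vote reaches a threshold; and that ties go to the lowest class index. The names `leave_one_out_error`, `tuples` and `threshold` are hypothetical.

```python
def leave_one_out_error(examples, labels, luts, tuples, num_classes, threshold=1):
    """Estimate the error rate on the training set without retraining.

    `luts[i]` maps a column address (tuple of sampled bits) to a list of
    per-class vote counts; `tuples[i]` lists the input positions sampled
    by LUT i.
    """
    errors = 0
    for example, label in zip(examples, labels):
        scores = [0] * num_classes
        for lut, positions in zip(luts, tuples):
            address = tuple(example[p] for p in positions)
            column = lut.get(address, [0] * num_classes)
            for cls in range(num_classes):
                # remove the example's own contribution from its true class
                vote = column[cls] - (1 if cls == label else 0)
                if vote >= threshold:
                    scores[cls] += 1
        # winner-takes-all; ties resolve to the lowest class index
        winner = max(range(num_classes), key=scores.__getitem__)
        if winner != label:
            errors += 1
    return errors / len(examples)
```

Because only the stored counts are read, the estimate costs one pass over the training set for each setting of `threshold` that is tried.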

7. A method according to any of the claims 3-6, wherein the validation set
comprises at least part of the training set(s) of input data examples.

8. A method according to any of the claims 1-7, wherein the output score 
functions are determined by a set of parameter values. 

9. A method according to any of the claims 1-8, wherein determination of
the output score functions comprises initialising the output score functions.

10. A method according to claim 9, wherein the initialisation of the output 
score functions comprises determining a number of set-up parameters. 

11. A method according to claim 9 or 10, wherein the initialisation of the
output score functions comprises setting all output score functions to a pre-determined
mapping function.

12. A method according to any of the claims 1-11, wherein determination
of the decision rules comprises initialising the decision rules.

13. A method according to claim 12, wherein the initialisation of the deci- 
sion rules comprises setting the rules to a pre-determined decision scheme. 

14. A method according to any of the claims 10-13, wherein the adjustment
comprises changing the values of the set-up parameters.

15. A method according to any of the claims 1-14, wherein the determination
of the column vector cell values comprises the training steps of

a) applying a training input data example of a known class to the classification
network, thereby addressing one or more column vectors,

b) incrementing, preferably by one, the value or vote of the cells of the 
addressed column vector(s) corresponding to the row(s) of the known 
class, and 

c) repeating steps (a)-(b) until all training examples have been applied to 
the network. 
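
Training steps (a)-(c) can be sketched as below. This is a minimal illustration, assuming binary input vectors, one LUT per n-tuple of sampled input positions, and an increment of one; the names `sample_address`, `train_luts` and `tuples` are hypothetical and not taken from the application.

```python
from collections import defaultdict


def sample_address(example, positions):
    """Form a column address from the input elements at the sampled positions."""
    return tuple(example[p] for p in positions)


def train_luts(examples, labels, tuples, num_classes):
    """Build one LUT per n-tuple: address -> list of per-class vote counts."""
    luts = [defaultdict(lambda: [0] * num_classes) for _ in tuples]
    for example, label in zip(examples, labels):
        # step (a): apply the example, addressing one column vector per LUT
        for lut, positions in zip(luts, tuples):
            column = lut[sample_address(example, positions)]
            # step (b): increment the cell in the row of the known class
            column[label] += 1
    # step (c): the loop ends once every training example has been applied
    return luts
```

Each LUT is stored sparsely here, so only columns that were actually addressed during training take up memory.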

16. A method according to any of the claims 1-15, wherein the adjustment
process comprises the steps of

determining a global quality value based on at least part of the column vector
cell values,

determining if the global quality value fulfils a required quality criterion,
and

adjusting at least part of the output score functions until the global quality
criterion is fulfilled.

17. A method according to any of the claims 1-16, wherein the adjustment 
process comprises the steps of 

a) selecting an input example from the validation set(s), 

b) determining a local quality value corresponding to the sampled validation 
input example, the local quality value being a function of at least part of the 
addressed column cell values, 

c) determining if the local quality value fulfils a required local quality criterion, if 
not, 

adjusting one or more of the output score functions if the local quality crite- 
rion is not fulfilled, 

d) selecting a new input example from a predetermined number of examples of 
the validation set(s), 

e) repeating the local quality test steps (b)-(d) for all the predetermined valida- 
tion input examples, 

f) determining a global quality value based on at least part of the column vec- 
tors being addressed during the local quality test, 

g) determining if the global quality value fulfils a required global quality criterion,
and,

h) repeating steps (a)-(g) until the global quality criterion is fulfilled.

18. A method according to any of the claims 1-17, wherein the adjustment 
process comprises the steps of 

determining a global quality value based on at least part of the column vector 
cell values, 

determining if the global quality value fulfils a required quality criterion, 

and 

adjusting at least part of the decision rules until the global quality criterion is 
fulfilled. 

19. A method according to any of the claims 1-18, wherein the adjustment
process comprises the steps of 

a) selecting an input example from the validation set(s), 

b) determining a local quality value corresponding to the sampled validation 
input example, the local quality value being a function of at least part of the 
addressed column cell values, 

c) determining if the local quality value fulfils a required local quality criterion, if 
not, 

adjusting one or more of the decision rules if the local quality criterion is not 
fulfilled, 

d) selecting a new input example from a predetermined number of examples of 
the validation set(s), 

e) repeating the local quality test steps (b)-(d) for all the predetermined valida- 
tion input examples, 

f) determining a global quality value based on at least part of the column vec- 
tors being addressed during the local quality test, 

g) determining if the global quality value fulfils a required global quality criterion, 
and, 

h) repeating steps (a)-(g) until the global quality criterion is fulfilled. 

20. A method according to any of the claims 1-15, wherein the adjustment 
process comprises the steps of 

determining a global quality value based on at least part of the column vector
cell values,

determining if the global quality value fulfils a required quality criterion,
and

adjusting at least part of the output score functions and part of the decision
rules until the global quality criterion is fulfilled.

21. A method according to any of the claims 1-15 or 20, wherein the adjustment
process comprises the steps of

a) selecting an input example from the validation set(s), 

b) determining a local quality value corresponding to the sampled validation 
input example, the local quality value being a function of at least part of the 
addressed column cell values, 

c) determining if the local quality value fulfils a required local quality criterion, if 
not, 

adjusting one or more of the output score functions and the decision rules if 
the local quality criterion is not fulfilled, 

d) selecting a new input example from a predetermined number of examples of 
the validation set(s), 

e) repeating the local quality test steps (b)-(d) for all the predetermined valida- 
tion input examples, 

f) determining a global quality value based on at least part of the column vec- 
tors being addressed during the local quality test, 

g) determining if the global quality value fulfils a required global quality criterion, 
and, 

h) repeating steps (a)-(g) until the global quality criterion is fulfilled. 



22. A method according to claim 17, 19 or 21, wherein steps (b)-(d) are
carried out for all examples of the validation set(s).

23. A method according to any of the claims 16-22, wherein the local
and/or global quality value is defined as functions of at least part of the column cells.

24. A method according to any of the claims 16-23, wherein the adjustment
iteration process is stopped if the quality criterion is not fulfilled after a given
number of iterations.

25. A method of classifying input data examples into at least one of a plu-
rality of classes using a computer classification system configured according to any
of the claims 1-24, whereby column cell values for each n-tuple or LUT and output
score functions and/or decision rules are determined using on one or more training
or validation sets of input data examples, said method comprising

a) applying an input data example to be classified to the configured classification 
network thereby addressing column vectors in the set of n-tuples or LUTs, 

b) selecting a set of classes which are to be compared using a given set of out- 
put score functions and decision rules thereby addressing specific rows in the 
set of n-tuples or LUTs, 

c) determining output score values as a function of the column vector cells and 
using the determined output score functions, 

d) comparing the calculated output values using the determined decision rules, 

and 

e) selecting the class or classes that win(s) according to the decision rules. 
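
Classification steps (a)-(e) can be sketched as follows. The default score function (one point per LUT whose addressed cell has a non-zero vote) and the winner-takes-all decision rule are assumptions chosen for illustration, not the particular functions and rules determined by the claimed training method; all function and parameter names are hypothetical.

```python
def output_scores(example, luts, tuples, num_classes, score_fn):
    """Steps (a)-(c): address one column per LUT and accumulate class scores."""
    scores = [0.0] * num_classes
    for lut, positions in zip(luts, tuples):
        address = tuple(example[p] for p in positions)
        column = lut.get(address, [0] * num_classes)
        for cls in range(num_classes):
            scores[cls] += score_fn(column[cls], cls)
    return scores


def winner_takes_all(scores):
    """Steps (d)-(e): return every class that attains the highest score."""
    best = max(scores)
    return [cls for cls, score in enumerate(scores) if score == best]


def classify(example, luts, tuples, num_classes,
             score_fn=lambda vote, cls: 1.0 if vote > 0 else 0.0,
             decision_rule=winner_takes_all):
    """Return the winning class or classes together with the output scores."""
    scores = output_scores(example, luts, tuples, num_classes, score_fn)
    return decision_rule(scores), scores
```

Passing a different `score_fn` or `decision_rule` is how a trained system would plug in the functions and rules obtained during training; a tie yields more than one winning class.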

26. A system for training a computer classification system which can be 

defined by a network comprising a stored number of n-tuples or Look Up Tables 
(LUTs), with each n-tuple or LUT comprising a number of rows corresponding to at 
least a subset of possible classes and further comprising a number of columns be- 
ing addressed by signals or elements of sampled training input data examples, each 
column being defined by a vector having cells with values, said system comprising 

a) input means for receiving training input data examples of known 
classes, 

b) means for sampling the received input data examples and addressing col- 
umn vectors in the stored set of n-tuples or LUTs, 

c) means for addressing specific rows in the set of n-tuples or LUTs, said rows 
corresponding to a known class, 

d) storage means for storing determined n-tuples or LUTs, 

e) means for determining column vector cell values so as to comprise or point 
to information based on the number of times the corresponding cell address 
is sampled from the training set(s) of input examples, characterised in that
said system further comprises

means for determining one or more output score functions and one or more
decision rules, wherein said output score functions and decision rules deter-
mining means is adapted for

determining said output score functions based on the information of at least 
part of the determined column vector cell values and a validation set of input 
data examples of known classes, and 

determining said decision rules based on the information of at least part of 
the determined column vector cell values and a validation set of input data 
examples of known classes, and wherein the means for determining the out- 
put score functions and decision rules comprises 

means for initialising one or more sets of output score functions and/or deci- 
sion rules, and 

means for adjusting output score functions and decision rules by use of at 
least part of the validation set of input examples. 

27. A system according to claim 26, wherein the means for determining 
the output score functions is adapted to determine such functions from a family of 
output score functions determined by a set of parameter values. 

28. A system according to claim 26 or 27, wherein said validation set com- 
prises at least part of the training set(s) used for determining the column cell values. 

29. A system according to any of the claims 26-28, wherein the means for 
determining the column vector cell values is adapted to determine these values as a 
function of the number of times the corresponding cell address is sampled from the 
set(s) of training input examples. 

30. A system according to any of the claims 26-29, wherein, when a train- 
ing input data example belonging to a known class is applied to the classification 
network thereby addressing one or more column vectors, the means for determining 
the column vector cell values is adapted to increment the value or vote of the cells of 
the addressed column vector(s) corresponding to the row(s) of the known class, said 
value preferably being incremented by one.

31. A system according to any of the claims 26-30, wherein the means for
adjusting output score functions is adapted to

determine a global quality value based on at least part of column vector cell
values,

determine if the global quality value fulfils a required global quality criterion,
and

adjust at least part of the output score functions until the global quality criterion
is fulfilled.



32. A system according to any of the claims 26-31, wherein the means for
adjusting output score functions and decision rules is adapted to 

a) determine a local quality value corresponding to a sampled validation input 
example, the local quality value being a function of at least part of the ad- 
dressed vector cell values, 

b) determine if the local quality value fulfils a required local quality criterion, 

c) adjust one or more of the output score functions if the local quality criterion is 
not fulfilled, 

d) repeat the local quality test for a predetermined number of training input ex- 
amples, 

e) determine a global quality value based on at least part of the column vectors 
being addressed during the local quality test, 

f) determine if the global quality value fulfils a required global quality criterion, 
and, 

g) repeat the local and the global quality test until the global quality criterion is 
fulfilled. 

33. A system according to any of the claims 26-32, wherein the means for 
adjusting decision rules is adapted to 

determine a global quality value based on at least part of column vector cell 
values, 

determine if the global quality value fulfils a required global quality criterion, 
and 

adjust at least part of the decision rules until the global quality criterion is
fulfilled.

34. A system according to any of the claims 26-33, wherein the means for
adjusting output score functions and decision rules is adapted to

a) determine a local quality value corresponding to a sampled validation input 
example, the local quality value being a function of at least part of the ad- 
dressed vector cell values, 

b) determine if the local quality value fulfils a required local quality criterion, 

c) adjust one or more of the decision rules if the local quality criterion is not 
fulfilled, 

d) repeat the local quality test for a predetermined number of training input ex- 
amples, 

e) determine a global quality value based on at least part of the column vectors 
being addressed during the local quality test, 

f) determine if the global quality value fulfils a required global quality criterion, 
and, 

g) repeat the local and the global quality test until the global quality criterion is 
fulfilled. 

35. A system according to any of the claims 26-30, wherein the means for 
adjusting decision rules is adapted to 

determine a global quality value based on at least part of column vector cell
values,

determine if the global quality value fulfils a required global quality criterion,
and

adjust at least part of the output score functions and decision rules until the
global quality criterion is fulfilled.

36. A system according to any of the claims 26-30 or 35, wherein the 
means for adjusting output score functions and decision rules is adapted to 

a) determine a local quality value corresponding to a sampled validation input 
example, the local quality value being a function of at least part of the ad- 
dressed vector cell values, 

b) determine if the local quality value fulfils a required local quality criterion, 

c) adjust one or more of the output score functions and decision rules if the
local quality criterion is not fulfilled,

d) repeat the local quality test for a predetermined number of training input
examples,

e) determine a global quality value based on at least part of the column vectors 
being addressed during the local quality test, 

f) determine if the global quality value fulfils a required global quality criterion, 
and, 

g) repeat the local and the global quality test until the global quality criterion is 
fulfilled. 

37. A system according to any of the claims 31-36, wherein the means for 
adjusting the output score functions and decision rules is further adapted to stop the 
iteration process if the global quality criterion is not fulfilled after a given number of 
iterations. 

38. A system according to any of the claims 26-37, wherein the means for 
storing n-tuples or LUTs comprises means for storing adjusted output score func- 
tions and decision rules and separate means for storing best so far output score 
functions and decision rules or best so far classification system configuration values. 

39. A system according to claim 38, wherein the means for adjusting the
output score functions and decision rules is further adapted to replace previously
separately stored best so far output score functions and decision rules with obtained
adjusted output score functions and decision rules if the determined global quality
value is closer to fulfilling the global quality criterion than the global quality value
corresponding to previously separately stored best so far output score functions and
decision rules.

40. A system for classifying input data examples of unknown classes into 
at least one of a plurality of classes, said system comprising: 

storage means for storing a number or set of n-tuples or Look Up Tables 
(LUTs) with each n-tuple or LUT comprising a number of rows corresponding 
to at least a subset of the number of possible classes and further comprising 
a number of column vectors, each column vector being addressed by signals 
or elements of a sampled input data example, and each column vector having
cell values being determined during a training process based on one or
more sets of training input data examples,

storage means for storing one or more output score functions and/or one or
more decision rules, each output score function and/or decision rule being
determined during a training or validation process based on one or more sets
of validation input data examples, said system further comprising:

input means for receiving an input data example to be classified,

means for sampling the received input data example and addressing column
vectors in the stored set of n-tuples or LUTs,

means for addressing specific rows in the set of n-tuples or LUTs, said rows
corresponding to a specific class,

means for determining output score values using the stored output score
functions and at least part of the stored column vector values, and

means for determining a winning class or classes based on the output score
values and stored decision rules.

41. A system according to claim 40, wherein the cell values of the column
vectors and the output score functions and/or decision rules of the classification
system are determined by use of a training system according to any of the claims
26-39.

42. A system according to claim 40, wherein the column vector cell values
and the output score functions and/or decision rules are determined during a training
process according to any of the claims 1-24.

25 
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1 . A method of training a computer classification system which can be de- 

fined by a network comprising a number of n-tuples or Look Up Tables (LUTs), with 
5 each n-tuple or LUT comprising a number of rows corresponding to at least a subset of 
possible classes and further comprising a number of columns being addressed by sig- 
nals or elements of sampled training input data examples, each column being defined 
by a vector having cells with values, said method comprising 

determining the column vector cell values based on one or more training
sets of input data examples for different classes so that at least part of the cells comprise
or point to information based on the number of times the corresponding cell address is
sampled from one or more sets of training input examples, and

determining one or more output score functions for evaluation of at least one
output score value per class, and/or

determining one or more decision rules to be used in combination with at least
part of the obtained output scores to determine a winning class,

said output score functions and/or decision rules being determined based on
the information of at least part of the determined column vector cell values.

2. A method according to claim 1, wherein the output score functions
and/or the decision rules are determined based on a validation set of input data
examples.

3. A method according to claim 2, wherein the validation set comprises at
least part of the training set(s) of input data examples.

4. A method according to any of the claims 1-3, wherein the output score 
functions are determined by a set of parameter values. 

5. A method according to any of the claims 1-4, wherein determination of
the output score functions and/or the decision rules is based on an information measure
evaluating the performance on the validation example set, said evaluating measure
preferably being a leave-one-out cross validation test.

6. A method according to any of the claims 1-5, wherein an output score
space is given by the output score variable containing the output score values, and
the decision rules define regions in the output score space to be addressed by obtained
output score values to obtain a winning class.

7. A method according to any of the claims 1-6, wherein determination of
the output score functions and/or the decision rules comprises initialising the output
score functions and/or the decision rules.

8. A method according to claim 7, wherein the initialisation of the output
score functions comprises determining a number of set-up parameters.

9. A method according to claims 7 or 8, wherein the initialisation of the
output score functions comprises setting all output score functions to a pre-determined
mapping function.

10. A method according to any of the claims 7-9, wherein the initialisation of 
the decision rules comprises setting the rules to a pre-determined decision scheme. 

11. A method according to any of the claims 1-10, further comprising adjusting
the output score functions and/or the decision rules, said adjustment preferably
being based on an information measure evaluation.

12. A method according to claim 11, wherein said information measure
evaluation is a leave-one-out cross validation test.

13. A method according to claim 8 and any of the claims 11-12, wherein the
adjustment comprises changing the values of the set-up parameters.

14. A method according to any of the claims 1-13, wherein the determination
of the column vector cell values comprises the training steps of

a) applying a training input data example of a known class to the classification
network, thereby addressing one or more column vectors,

b) incrementing, preferably by one, the value or vote of the cells of the addressed
column vector(s) corresponding to the row(s) of the known class,
and

c) repeating steps (a)-(b) until all training examples have been applied to the 
network. 

15. A method according to any of the claims 11-14, wherein the adjustment 
process comprises the steps of 

determining a global quality value based on at least part of the column vector 
cell values, 

determining if the global quality value fulfils a required quality criterion, and 
adjusting at least part of output score functions and/or part of the decision rules 
until the global quality criterion is fulfilled. 

16. A method according to any of the claims 11-15, wherein the adjustment
process comprises the steps of

a) selecting an input example from the validation set(s), 

b) determining a local quality value corresponding to the sampled validation input 
example, the local quality value being a function of at least part of the ad- 
dressed column cell values, 

c) determining if the local quality value fulfils a required local quality criterion, if 
not, 

adjusting one or more of the output score functions and/or decision rules if the 
local quality criterion is not fulfilled, 

d) selecting a new input example from a predetermined number of examples of 
the validation set(s), 

e) repeating the local quality test steps (b)-(d) for all the predetermined validation 
input examples, 

f) determining a global quality value based on at least part of the column vectors 
being addressed during the local quality test, 

g) determining if the global quality value fulfils a required global quality criterion, 
and, 

h) repeating steps (a)-(g) until the global quality criterion is fulfilled.

17. A method according to claim 16, wherein steps (b)-(d) are carried out for
all examples of the validation set(s).

18. A method according to any of the claims 15-17, wherein the local and/or
global quality value is defined as functions of at least part of the column cells.

19. A method according to any of the claims 15-18, wherein the adjustment
iteration process is stopped if the quality criterion is not fulfilled after a given number
of iterations.



20. A method of classifying input data examples into at least one of a plurality
of classes using a computer classification system configured according to any of the
claims 1-19, whereby column cell values for each n-tuple or LUT and output score
functions and/or decision rules are determined using one or more training or validation
sets of input data examples, said method comprising

a) applying an input data example to be classified to the configured classification 
network thereby addressing column vectors in the set of n-tuples or LUTs, 

b) selecting a set of classes which are to be compared using a given set of output
score functions and decision rules thereby addressing specific rows in the set of
n-tuples or LUTs,

c) determining output score values as a function of the column vector cells and us- 
ing the determined output score functions, 

d) comparing the calculated output values using the determined decision rules, and 

e) selecting the class or classes that win(s) according to the decision rules. 

21. A system for training a computer classification system which can be defined
by a network comprising a stored number of n-tuples or Look Up Tables (LUTs),
with each n-tuple or LUT comprising a number of rows corresponding to at least a
subset of possible classes and further comprising a number of columns being addressed by
signals or elements of sampled training input data examples, each column being defined
by a vector having cells with values, said system comprising

a) input means for receiving training input data examples of known classes, 

b) means for sampling the received input data examples and addressing column vectors
in the stored set of n-tuples or LUTs,

c) means for addressing specific rows in the set of n-tuples or LUTs, said rows
corresponding to a known class,

d) storage means for storing determined n-tuples or LUTs,

e) means for determining column vector cell values so as to comprise or point to
information based on the number of times the corresponding cell address is sampled
from the training set(s) of input examples, and

f) means for determining one or more output score functions and/or one or more
decision rules, said output score functions and/or decision rules determining
means being adapted to determine said functions and/or rules based on the
information of at least part of the determined column vector cell values.

22. A system according to claim 21, wherein the means for determining the 
output score functions is adapted to determine such functions from a family of output 
score functions determined by a set of parameter values. 

23. A system according to claim 21 or 22, wherein the means for determining
the output score functions and/or the decision rules is adapted to determine such
functions and/or rules based on a validation set of input data examples of known
classes, said validation set preferably comprising at least part of the training set(s) used
for determining the column cell values.



24. A system according to any of the claims 21-23, wherein the means for
determining the output score functions and decision rules comprises

means for initialising one or more sets of output score functions and/or decision rules, and
means for adjusting output score functions and decision rules by use of at least part of
the validation set of input examples.

25. A system according to any of the claims 21-24, wherein the means for
determining the column vector cell values is adapted to determine these values as a
function of the number of times the corresponding cell address is sampled from the
set(s) of training input examples.

26. A system according to any of the claims 21-25, wherein, when a training
input data example belonging to a known class is applied to the classification network
thereby addressing one or more column vectors, the means for determining the column
vector cell values is adapted to increment the value or vote of the cells of the addressed
column vector(s) corresponding to the row(s) of the known class, said value preferably
being incremented by one.

27. A system according to any of the claims 24-26, wherein the means for
adjusting output score functions and/or decision rules is adapted to

determine a global quality value based on at least part of column vector cell
values,

determine if the global quality value fulfils a required global quality criterion,
and

adjust at least part of the output score functions and/or decision rules until the
global quality criterion is fulfilled.

28. A system according to any of the claims 24-27, wherein the means for
adjusting output score functions and decision rules is adapted to

a) determine a local quality value corresponding to a sampled validation input
example, the local quality value being a function of at least part of the
addressed vector cell values,

b) determine if the local quality value fulfils a required local quality criterion,

c) adjust one or more of the output score functions and/or decision rules if the
local quality criterion is not fulfilled,

d) repeat the local quality test for a predetermined number of training input
examples,

e) determine a global quality value based on at least part of the column vectors
being addressed during the local quality test,

f) determine if the global quality value fulfils a required global quality criterion,
and,

g) repeat the local and the global quality test until the global quality criterion is
fulfilled.



29. A system according to any of the claims 27 or 28, wherein the means for
adjusting the output score functions and decision rules is further adapted to stop the
iteration process if the global quality criterion is not fulfilled after a given number of
iterations.

30. A system according to any of the claims 21-29, wherein the means for
storing n-tuples or LUTs comprises means for storing adjusted output score functions
and decision rules and separate means for storing best so far output score functions and
decision rules or best so far classification system configuration values.

31. A system according to claim 30, wherein the means for adjusting the
output score functions and decision rules is further adapted to replace previously
separately stored best so far output score functions and decision rules with obtained
adjusted output score functions and decision rules if the determined global quality value
is closer to fulfilling the global quality criterion than the global quality value corresponding
to previously separately stored best so far output score functions and decision rules.

32. A system for classifying input data examples of unknown classes into at 
least one of a plurality of classes, said system comprising: 

storage means for storing a number or set of n-tuples or Look Up Tables (LUTs)
with each n-tuple or LUT comprising a number of rows corresponding to at
least a subset of the number of possible classes and further comprising a number
of column vectors, each column vector being addressed by signals or elements
of a sampled input data example, and each column vector having cell
values being determined during a training process based on one or more sets of
training input data examples,

storage means for storing one or more output score functions and/or one or
more decision rules, each output score function and/or decision rule being determined
during a training or validation process based on one or more sets of
validation input data examples, said system further comprising:

input means for receiving an input data example to be classified,

means for sampling the received input data example and addressing column
vectors in the stored set of n-tuples or LUTs,

means for addressing specific rows in the set of n-tuples or LUTs, said rows
corresponding to a specific class,

means for determining output score values using the stored output score functions
and at least part of the stored column vector values, and

means for determining a winning class or classes based on the output score values
and stored decision rules.

33. A system according to claim 32, wherein the cell values of the column
vectors and the output score functions and/or decision rules of the classification system
are determined by use of a training system according to any of the claims 21-31.

34. A system according to claim 32, wherein the column vector cell values
and the output score functions and/or decision rules are determined during a training
process according to any of the claims 1-19.


N-TUPLE OR RAM BASED NEURAL NETWORK CLASSIFICATION SYSTEM
AND METHOD

BACKGROUND OF THE INVENTION 



1. Field of the Invention

The present invention relates generally to n-tuple or RAM based neural network
classification systems and, more particularly, to n-tuple or RAM based classification
systems where the decision criteria applied to obtain the output scores and compare
these output scores to obtain a classification are determined during a training process.

2. Description of the Prior Art

A known way of classifying objects or patterns represented by electric signals or binary
codes and, more precisely, by vectors of signals applied to the inputs of neural network
classification systems lies in the implementation of a so-called learning or training
phase. This phase generally consists of the configuration of a classification network
that fulfils a function of performing the envisaged classification as efficiently as
possible by using one or more sets of signals, called learning or training sets, where the
membership of each of these signals in one of the classes in which it is desired to
classify them is known. This method is known as supervised learning or learning with a
teacher.

A subclass of classification networks using supervised learning are networks using
memory-based learning. Here, one of the oldest memory-based networks is the "n-tuple
network" proposed by Bledsoe and Browning (Bledsoe, W.W. and Browning, I., 1959,
"Pattern recognition and reading by machine", Proceedings of the Eastern Joint
Computer Conference, pp. 225-232) and more recently described by Morciniec and
Rohwer (Morciniec, M. and Rohwer, R., 1996, "A theoretical and experimental
account of n-tuple classifier performance", Neural Comp., pp. 629-642).

One of the benefits of such a memory-based system is a very fast computation time,
both during the learning phase and during classification. For the known types of n-tuple
networks, which are also known as "RAM networks" or "weightless neural
networks", learning may be accomplished by recording features of patterns in a
random-access memory (RAM), which requires just one presentation of the training
set(s) to the system.

The training procedure for a conventional RAM based neural network is described by
Jørgensen (co-inventor of this invention) et al. in a contribution to a recent book on
RAM based neural networks (T.M. Jørgensen, S.S. Christensen, and C. Liisberg,
"Cross-validation and information measures for RAM based neural networks," RAM-
based neural networks, J. Austin, ed., World Scientific, London, pp. 78-88, 1998). The
contribution describes how the RAM based neural network may be considered as
comprising a number of Look Up Tables (LUTs). Each LUT may probe a subset of a
binary input data vector. In the conventional scheme the bits to be used are selected at
random. The sampled bit sequence is used to construct an address. This address
corresponds to a specific entry (column) in the LUT. The number of rows in the LUT
corresponds to the number of possible classes. For each class the output can take on
the values 0 or 1. A value of 1 corresponds to a vote on that specific class. When
performing a classification, an input vector is sampled, the output vectors from all
LUTs are added, and subsequently a winner-takes-all decision is made to classify the
input vector. In order to perform a simple training of the network, the output values
may initially be set to 0. For each example in the training set, the following steps
should then be carried out:

Present the input vector and the target class to the network, 

for all LUTs calculate their corresponding column entries, and 

set the output value of the target class to 1 in all the "active" columns. 
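
The conventional scheme described above can be illustrated with a minimal sketch. It is not taken from the cited contribution; the function names (make_luts, column_address, train_binary, scores), the dictionary-based storage and the toy data are assumptions made for illustration only.

```python
import random

def make_luts(num_luts, n, input_bits, seed=0):
    """Pick n random input-bit positions for each LUT (the conventional scheme)."""
    rng = random.Random(seed)
    return [rng.sample(range(input_bits), n) for _ in range(num_luts)]

def column_address(bits, positions):
    """Build the column address from the sampled bit sequence."""
    address = 0
    for p in positions:
        address = (address << 1) | bits[p]
    return address

def train_binary(luts, examples, num_classes):
    """Set the target-class cell to 1 in every 'active' column."""
    tables = [dict() for _ in luts]  # address -> list of 0/1 values, one per class
    for bits, target in examples:
        for table, positions in zip(tables, luts):
            address = column_address(bits, positions)
            column = table.setdefault(address, [0] * num_classes)
            column[target] = 1
    return tables

def scores(luts, tables, bits, num_classes):
    """Add the output vectors from all LUTs for one input vector."""
    total = [0] * num_classes
    for table, positions in zip(tables, luts):
        column = table.get(column_address(bits, positions))
        if column is not None:
            for c in range(num_classes):
                total[c] += column[c]
    return total

# Toy data (hypothetical): two 6-bit patterns, one per class.
examples = [([0, 0, 1, 1, 0, 1], 0), ([1, 1, 0, 0, 1, 0], 1)]
luts = make_luts(num_luts=4, n=2, input_bits=6)
tables = train_binary(luts, examples, num_classes=2)
```

In this sketch every training pattern collects one vote per LUT for its own class, which is the maximum possible score; other classes may reach the same score, which is where ambiguous decisions come from.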

By use of such a training strategy it may be guaranteed that each training pattern 
always obtains the maximum number of votes. As a result such a network makes no 
misclassification on the training set, but ambiguous decisions may occur. Here, the 
generalisation capability of the network is directly related to the number of input bits 
for each LUT. If a LUT samples all input bits then it will act as a pure memory device 
and no generalisation will be provided. As the number of input bits is reduced the
generalisation is increased at the expense of an increasing number of ambiguous
decisions. Furthermore, the classification and generalisation performances of a LUT
are highly dependent on the actual subset of input bits probed. The purpose of an 
"intelligent" training procedure is thus to select the most appropriate subsets of input 
data. 

Jørgensen et al. further describe what is named a "leave-one-out cross-validation test",
which suggests a method for selecting an optimal number of input connections to use
per LUT in order to obtain a low classification error rate with a short overall
computation time. In order to perform such a cross-validation test it is necessary to
know the actual number of training examples that have visited or
addressed the cell or element corresponding to the addressed column and class. It is
therefore suggested that these numbers are stored in the LUTs. Jørgensen et al. also
suggest how the LUTs in the network can be selected in a more optimal way
by successively training new sets of LUTs and performing a cross-validation test on each
LUT. Thus, it is known to have a RAM network in which the LUTs are selected by
presenting the training set to the system several times.
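
A sketch of why the stored visit counts make a leave-one-out test cheap is given below. It assumes one plausible reading of the test, namely that the contribution of the example under test is removed by subtracting one from its own true-class cells before votes are counted; the function names and the treatment of ties as errors are illustrative assumptions, not details taken from the cited contribution.

```python
def loo_scores(addressed_columns, true_class):
    """Votes per class with the example's own visit removed.

    addressed_columns holds, for each LUT, the list of per-class visit
    counts of the column addressed by the example.
    """
    num_classes = len(addressed_columns[0])
    votes = [0] * num_classes
    for counts in addressed_columns:
        for c in range(num_classes):
            # Remove the example's own visit from its true-class cell.
            remaining = counts[c] - 1 if c == true_class else counts[c]
            if remaining > 0:
                votes[c] += 1
    return votes

def loo_error_rate(examples):
    """examples: list of (addressed_columns, true_class) from the training set."""
    errors = 0
    for columns, true_class in examples:
        votes = loo_scores(columns, true_class)
        best = max(votes)
        # Assumption: a tie for the top score is counted as an error.
        if votes[true_class] < best or votes.count(best) > 1:
            errors += 1
    return errors / len(examples)
```

No retraining is needed for each left-out example, since the counts already record how many other examples visited the same cell.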

The output vector from the RAM network contains a number of output scores, one for
each possible class. As mentioned above, a decision is normally made by classifying an
example into the class having the largest output score. This simple winner-takes-all
(WTA) scheme assures that the true class of a training example cannot lose to one of
the other classes. One problem with the RAM net classification scheme is that it often
behaves poorly when trained on a training set where the distribution of examples
between the training classes is highly skewed. Accordingly, there is a need for
understanding the influence of the composition of the training material on the
behaviour of the RAM classification system as well as a general understanding of the
influence of specific parameters of the architecture on the performance. From such an
understanding it could be possible to modify the classification scheme to improve its
performance and competitiveness with other schemes. Such improvements of the RAM
based classification systems are provided according to the present invention.

SUMMARY OF THE INVENTION

Recently Thomas Martini Jørgensen and Christian Linneberg (inventors of this
invention) have provided a statistical framework that has made it possible to make a
theoretical analysis that relates the expected output scores of the n-tuple net to the
stochastic parameters of the example distributions, the number of available training
examples, and the number of address lines n used for each LUT or n-tuple. From the
obtained expressions, they have been able to study the behaviour of the architecture in
different scenarios. Furthermore, based on the theoretical results, they have come up
with proposals for modifying the n-tuple classification scheme in order to make it
operate as a close approximation to the maximum a posteriori or maximum likelihood
estimator. The resulting modified decision criteria can for example deal with the so-called
skewed class prior problem, which causes the n-tuple net to often behave poorly when
trained on a training set where the distribution of examples between the training classes
is highly skewed. Accordingly, the proposed changes of the classification scheme
provide an essential improvement of the architecture. The suggested changes in
decision criteria are not only applicable to the original n-tuple architecture based on
random memorisation. They also apply to extended n-tuple schemes, some of which use
a more optimal selection of the address lines and some of which apply an extended
weight scheme.

According to a first aspect of the present invention there is provided a method for 
training a computer classification system which can be defined by a network 
comprising a number of n-tuples or Look Up Tables (LUTs), with each n-tuple or 
LUT comprising a number of rows corresponding to at least a subset of possible 
classes and further comprising a number of columns being addressed by signals or 
elements of sampled training input data examples, each column being defined by a 
vector having cells with values, said method comprising 

determining the column vector cell values based on one or more training sets of 
input data examples for different classes so that at least part of the cells comprise or 
point to information based on the number of times the corresponding cell address is 
sampled from one or more sets of training input examples. The method further 
comprises 

determining one or more output score functions for evaluation of at least one 
output score value per class, and/or 

determining one or more decision rules to be used in combination with at least 
part of the obtained output score values to determine a winning class. 

It is preferred that the output score values are evaluated or determined based on the 
information of at least part of the determined column vector cell values. 

According to the present invention it is preferred that the output score functions and/or 
the decision rules are determined based on the information of at least part of the 
determined column vector cell values. 

It is also preferred to determine the output score functions from a family of output
score functions determined by a set of parameter values. Thus, the output score
functions may be determined either from the set of parameter values, from the
information of at least part of the determined column vector cell values or from both
the set of parameter values and the information of at least part of the
determined column vector cell values.

It should be understood that the training procedure of the present invention may be 
considered a two step training procedure. The first step may comprise determining the 
column vector cell values, while the second step may comprise determining the output 
score functions and/or the decision rules. 

As already mentioned, the column vector cells are determined based on one or more 
training sets of input data examples of known classes, but the output score functions 
and/or the decision rules may be determined based on a validation set of input data 
examples of known classes. Here the validation set may be equal to or part of the 
training set(s), but the validation set may also be a set of examples not included in the 
training set(s). 

According to the present invention the training and/or validation input data examples 
may preferably be presented to the network as input signal vectors. 

It is preferred that determination of the output score functions is performed so as to 
allow different ways of using the contents of the column vector cells in calculating the 
output scores used to find the winning class amongst two or more classes. The way the 
contents of the column vector cells are used to obtain the score of one class might 
depend on which class(es) it is compared with. 

It is also preferred that the decision rules used when comparing two or more classes in 
the output space are allowed to deviate from the decision rules corresponding to a 
WTA decision. Changing the decision rules for choosing two or more classes is 
equivalent to allowing individual transformation of the class output scores and keeping 
a WTA comparison. These corresponding transformations might depend on which 
class(es) a given class is compared with. 
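
The stated equivalence between a changed decision rule and individually transformed scores under a WTA comparison can be illustrated for two classes. The sketch below is only an example: the linear rule, the slope value and the function names are assumptions chosen for illustration and are not prescribed by the invention.

```python
def wta(scores):
    """Plain winner-takes-all: index of the largest score (first index on ties)."""
    return max(range(len(scores)), key=lambda c: scores[c])

def transformed_wta(scores, transforms):
    """Apply an individual transformation to each class score, then keep WTA."""
    return wta([f(s) for f, s in zip(transforms, scores)])

def linear_rule(scores, slope):
    """A two-class decision rule deviating from WTA:
    class 1 wins only if score_1 > slope * score_0."""
    return 1 if scores[1] > slope * scores[0] else 0

# Scaling the class-0 score by the slope and keeping WTA gives the same decisions.
transforms = [lambda s: 2.0 * s, lambda s: s]
sample_scores = [(3, 5), (3, 7), (0, 1), (4, 4), (1, 2)]
decisions = [linear_rule(s, 2.0) for s in sample_scores]
```

With a slope other than 1 the decision border in the output score space no longer follows the diagonal, which is one way the class scores of a skewed training set might be rebalanced.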

The determination of how the output score functions may be calculated from the 
column vector cell values, as well as the determination of how many output score 
functions to use and/or the determination of the decision rules to be applied on the 
output score values may comprise the initialisation of one or more sets of output score 
functions and/or decision rules. 

Furthermore it is preferred to adjust at least part of the output score functions and/or
the decision rules based on an information measure evaluating the performance on the
validation example set. If the validation set equals the training set or part of the
training set it is preferred to use a leave-one-out cross-validation evaluation or
extensions of this concept.

In order to determine or adjust the output score functions and the decision rules
according to the present invention, the column cell values should be determined. Here,
it is preferred that at least part of the column cell values are determined as a function
of the number of times the corresponding cell address is sampled from the set(s) of
training input examples. Alternatively, the information of the column cells may be
determined so that the maximum column cell value is 1, but at least part of the cells
have an associated value being a function of the number of times the corresponding
cell address is sampled from the training set(s) of input examples. Preferably, the
column vector cell values are determined and stored in storing means before the
determination or adjustment of the output score functions and/or the decision rules.

According to the present invention, a preferred way of determining the column vector 
cell values may comprise the training steps of 

a) applying a training input data example of a known class to the 
classification network, thereby addressing one or more column vectors, 

b) incrementing, preferably by one, the value or vote of the cells of the
addressed column vector(s) corresponding to the row(s) of the known
class, and

c) repeating steps (a)-(b) until all training examples have been applied to the 
network. 
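
Training steps (a)-(c) can be sketched as follows. The data layout (one dictionary per LUT mapping a column address to a list of per-class counts), the fixed bit positions and the names column_address and train_counts are assumptions made for the purpose of illustration.

```python
from collections import defaultdict

def column_address(bits, positions):
    """Build a column address from the input bits sampled by one LUT."""
    address = 0
    for p in positions:
        address = (address << 1) | bits[p]
    return address

def train_counts(luts, training_set, num_classes):
    """Count, per cell, how many training examples of each class addressed it."""
    tables = [defaultdict(lambda: [0] * num_classes) for _ in luts]
    for bits, known_class in training_set:              # step (c): all examples
        for table, positions in zip(tables, luts):      # step (a): address columns
            address = column_address(bits, positions)
            table[address][known_class] += 1            # step (b): increment by one
    return tables

# Hypothetical toy set: 4-bit inputs, two LUTs probing two bits each.
luts = [[0, 1], [2, 3]]
training_set = [([1, 0, 1, 1], 0), ([1, 0, 0, 0], 0), ([0, 1, 1, 1], 1)]
tables = train_counts(luts, training_set, num_classes=2)
```

A single pass over the training set suffices, and the resulting counts are the information on which the output score functions and decision rules are subsequently determined.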

However, it should be understood that the present invention also covers embodiments
where the information of the column cells is determined by alternative functions of the
number of times the cell has been addressed by the input training set(s). Thus, the cell
information does not need to comprise a count of all the times the cell has been
addressed, but may for example comprise an indication of when the cell has been
visited zero times, once, more than once, and/or twice and more than twice and so on.
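
One simple function of this kind is a saturating count. The sketch below is only an illustration; the name visit_indicator and the cap parameter are hypothetical.

```python
def visit_indicator(count, cap=2):
    """Map a raw visit count to a coarse indication.

    With cap=2 the result distinguishes zero visits (0), exactly one
    visit (1) and more than one visit (2); a larger cap keeps more levels.
    """
    return min(count, cap)

raw_counts = [0, 1, 2, 5]
coarse = [visit_indicator(c) for c in raw_counts]
finer = [visit_indicator(c, cap=3) for c in raw_counts]
```

Such an indication needs fewer bits per cell than a full count, while still distinguishing a cell visited only once from a cell visited several times.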

In order to determine the output score functions and/or the decision rules, it is 
preferred to adjust these output score functions and/or decision rules, which 
adjustment process may comprise one or more iteration steps. The adjustment of the 
output score functions and/or the decision rules may comprise the steps of 

determining a global quality value based on at least part of the column vector
cell values,

determining if the global quality value fulfils a required quality criterion, and

adjusting at least part of output score functions and/or part of the decision rules
until the global quality criterion is fulfilled.

The adjustment process may also include determination of a local quality value for 
each sampled validation input example, with one or more adjustments being performed 
if the local quality value does not fulfil a specified or required local quality criterion for 
the selected input example. As an example the adjustment of the output score functions 
and/or the decision rules may comprise the steps of 

a) selecting an input example from the validation set(s), 

b) determining a local quality value corresponding to the sampled validation input
example, the local quality value being a function of at least part of the
addressed column cell values,

c) determining if the local quality value fulfils a required local quality criterion, if 
not, 

adjusting one or more of the output score functions and/or decision rules if the 
local quality criterion is not fulfilled, 

d) selecting a new input example frorn a predetermined number of examples of the 
validation set(s). 
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c) repeating the locai quality test steps (bM<0 for aU the predetermined validation 
input examples, 

0 determining a global quality value based on at least part of the column vectors 
being addressed during the local quality test, 

g) determining if the global quality value fulfils a required global quality criterion, 
and, 

h) repeating steps (aXg) until the global quality criterion is fulfilled. 

Preferably, steps (b)-(d) of the above mentioned adjustment process may be carried out 
for all examples of the validation set(s).

The local and/or global quality value may be defined as functions of at least part of the 
column cells. 

It should be understood that when adjusting the output score functions and/or decision 
rules by use of one or more quality values each with a corresponding quality criterion, 
it may be preferred to stop the adjustment iteration process if a quality criterion is not 
fulfilled after a given number of iterations. 

It should also be understood that during the adjustment process the adjusted output 
score functions and/or decision rules are preferably stored after each adjustment, and 
when the adjustment process includes the determination of a global quality value, the 
step of determination of the global quality value may further be followed by separately 
storing the hereby obtained output score functions and/or decision rules or 
classification system configuration values if the determined global quality value is 
closer to fulfilling the global quality criterion than the global quality value corresponding
to previously separately stored output score functions and/or decision rules or 
configuration values. 
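
By way of a non-limiting illustration, the adjustment process with local and global quality tests, an iteration limit and separate storage of the best configuration so far may be sketched as follows. The names `adjust`, `local_ok`, `adjust_once`, `global_quality`, `target` and `max_iters` are hypothetical and are not taken from the present description. The sketch further assumes that a larger global quality value is better and that the global quality criterion is fulfilled when the value reaches `target`.

```python
def adjust(config, examples, local_ok, adjust_once, global_quality, target, max_iters):
    """Iteratively adjust a configuration (score functions and/or decision rules).

    All callables are hypothetical placeholders supplied by the caller:
      local_ok(config, x)          -> True if the local quality criterion holds for x
      adjust_once(config, x)       -> a new, adjusted configuration
      global_quality(config, exs)  -> a number, larger meaning better (assumption)
    """
    best_config = config
    best_q = global_quality(config, examples)
    for _ in range(max_iters):              # stop after a given number of iterations
        if best_q >= target:                # global quality criterion fulfilled
            break
        for x in examples:                  # local quality test for each example
            if not local_ok(config, x):
                config = adjust_once(config, x)
        q = global_quality(config, examples)
        if q > best_q:                      # separately store the best configuration so far
            best_config, best_q = config, q
    return best_config, best_q
```

Because the best configuration is kept separately, the function returns a usable result even when the iteration limit is reached before the global criterion is fulfilled.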

A main reason for training a classification system according to an embodiment of the 
present invention is to obtain a high confidence in a subsequent classification process 
of an input example of an unknown class. 

Thus, according to a further aspect of the present invention, there is also provided a 
method of classifying input data examples into at least one of a plurality of classes 
using a computer classification system configured according to any of the above 
described methods of the present invention, whereby column cell values for each
n-tuple or LUT and output score functions and/or decision rules are determined using
one or more training or validation sets of input data examples, said method comprising 

a) applying an input data example to be classified to the configured classification
network thereby addressing column vectors in the set of n-tuples or LUTs,

b) selecting a set of classes which are to be compared using a given set of output
score functions and decision rules thereby addressing specific rows in the set of
n-tuples or LUTs,

c) determining output score values as a function of the column vector cells and 
using the determined output score functions, 

d) comparing the calculated output values using the determined decision rules, and 

e) selecting the class or classes that win(s) according to the decision rules. 
The present invention also provides training and classification systems according to the
above described methods of training and classification. 

Thus, according to the present invention there is provided a system for training a 
computer classification system which can be defined by a network comprising a stored 
number of n-tuples or Look Up Tables (LUTs), with each n-tuple or LUT comprising 
a number of rows corresponding to at least a subset of possible classes and further 
comprising a number of columns being addressed by signals or elements of sampled 
training input data examples, each column being defined by a vector having cells with 
values, said system comprising 

• input means for receiving training input data examples of known classes, 

• means for sampling the received input data examples and addressing column 
vectors in the stored set of n-tuples or LUTs, 

• means for addressing specific rows in the set of n-tuples or LUTs, said rows 
corresponding to a known class, 

• storage means for storing determined n-tuples or LUTs, 

• means for determining column vector cell values so as to comprise or point to
information based on the number of times the corresponding cell address is
sampled from the training set(s) of input examples, and

• means for determining one or more output score functions and/or one or more
decision rules.

Here, it is preferred that the means for determining the output score functions and/or 
decision rules is adapted to determine these functions and/or rules based on the 
information of at least part of the determined column vector cell values. 

The means for determining the output score functions may be adapted to determine 
such functions from a family of output score functions determined by a set of 
parameter values. Thus, the means for determining the output score functions may be 
adapted to determine such functions either from the set of parameter values, from the 
information of at least part of the determined column vector cell values or from both
the set of parameter values and the information of at least part of the 
determined column vector cell values. 

According to the present invention the means for determining the output score 
functions and/or the decision rules may be adapted to determine such functions and/or 
rules based on a validation set of input data examples of known classes. Here the 
validation set may be equal to or part of the training set(s) used for determining the 
column cell values, but the validation set may also be a set of examples not included in 
the training set(s). 

In order to determine the output score functions and decision rules according to a 
preferred embodiment of the present invention, the means for determining the output 
score functions and decision rules may comprise 

means for initialising one or more sets of output score functions and/or decision
rules, and

means for adjusting output score functions and decision rules by use of at least 
part of the validation set of input examples. 
As already discussed above the column cell values should be determined in order to
determine the output score functions and decision rules. Here, it is preferred that the 
means for determining the column vector cell values is adapted to determine these 
values as a function of the number of times the corresponding cell address is sampled 
from the set(s) of training input examples. Alternatively, the means for determining the 
column vector cell values may be adapted to determine these cell values so that the 
maximum value is 1, but at least part of the cells have an associated value being a
function of the number of times the corresponding cell address is sampled from the
training set(s) of input examples.

According to an embodiment of the present invention it is preferred that when a
training input data example belonging to a known class is applied to the classification
network thereby addressing one or more column vectors, the means for determining
the column vector cell values is adapted to increment the value or vote of the cells of
the addressed column vector(s) corresponding to the row(s) of the known class, said
value preferably being incremented by one.

For the adjustment process of the output score functions and decision rules it is
preferred that the means for adjusting output score functions and/or decision rules is
adapted to

determine a global quality value based on at least part of column vector cell
values,

determine if the global quality value fulfils a required global quality criterion, 
and 

adjust at least part of the output score functions and/or decision rules until the 
global quality criterion is fulfilled. 

As an example of a preferred embodiment according to the present invention, the 
means for adjusting output score functions and decision rules may be adapted to 

a) determine a local quality value corresponding to a sampled validation input 
example, the local quality value being a function of at least part of the 
addressed vector cell values, 

b) determine if the local quality value fulfils a required local quality criterion, 

c) adjust one or more of the output score functions and/or decision rules if the 
local quality criterion is not fulfilled, 

d) repeat the local quality test for a predetermined number of training input 
examples, 

e) determine a global quality value based on at least part of the column vectors 
being addressed during the local quality test, 

f) determine if the global quality value fulfils a required global quality criterion,
and, 

g) repeat the local and the global quality test until the global quality criterion is 
fulfilled. 

The means for adjusting the output score functions and decision rules may further be 
adapted to stop the iteration process if the global quality criterion is not fulfilled after a 
given number of iterations. In a preferred embodiment, the means for storing n-tuples 
or LUTs comprises means for storing adjusted output score functions and decision 
rules and separate means for storing best so far output score functions and decision
rules or best so far classification system configuration values. Here, the means for 
adjusting the output score functions and decision rules may further be adapted to 
replace previously separately stored best so far output score functions and decision 
rules with obtained adjusted output score functions and decision rules if the determined 
global quality value is closer to fulfilling the global quality criterion than the global quality
value corresponding to previously separately stored best so far output score functions 
and decision rules. Thus, even if the system should not be able to fulfil the global 
quality criterion within a given number of iterations, the system may always comprise 
the "best so far" system configuration.

According to a further aspect of the present invention there is also provided a system 
for classifying input data examples of unknown classes into at least one of a plurality of 
classes, said system comprising: 

storage means for storing a number or set of n-tuples or Look Up Tables
(LUTs) with each n-tuple or LUT comprising a number of rows corresponding 
to at least a subset of the number of possible classes and further comprising a 
number of column vectors, each column vector being addressed by signals or 
elements of a sampled input data example, and each column vector having cell 
values being determined during a training process based on one or more sets of 
training input data examples, 

storage means for storing one or more output score functions and/or one or
more decision rules, each output score function and/or decision rule being
determined during a training or validation process based on one or more sets of
validation input data examples, said system further comprising:

input means for receiving an input data example to be classified,

means for sampling the received input data example and addressing column
vectors in the stored set of n-tuples or LUTs,

means for addressing specific rows in the set of n-tuples or LUTs, said rows 
corresponding to a specific class, 

means for determining output score values using the stored output score 
functions and at least part of the stored column vector values, and 
means for determining a winning class or classes based on the output score 
values and stored decision rules. 

It should be understood that it is preferred that the cell values of the column vectors 
and the output score functions and/or decision rules of the classification system 
according to the present invention are determined by use of a training system 
according to any of the above described systems. Accordingly, the column vector cell
values and the output score functions and/or decision rules may be determined during a 
training process according to any of the above described methods. 

BRIEF DESCRIPTION OF THE DRAWINGS 

For a better understanding of the present invention and in order to show how the same 
may be carried into effect, reference will now be made by way of example to the 
accompanying drawings in which: 
Fig. 1 shows a block diagram of a RAM classification network with Look Up Tables
(LUTs), 

Fig. 2 shows a detailed block diagram of a single Look Up Table (LUT) according to 
an embodiment of the present invention, 



Fig. 3 shows a block diagram of a computer classification system according to the 
present invention, 

Fig. 4 shows a flow chart of a learning process for LUT column cells according to an 
embodiment of the present invention, 

Fig. 5 shows a flow chart of a learning process according to an embodiment of the
present invention, 

Fig. 6 shows a flow chart of a classification process according to the present invention.

DETAILED DESCRIPTION OF THE INVENTION

In the following a more detailed description of the architecture and concept of a 
classification system according to the present invention will be given including an 
example of a training process of the column cells of the architecture and an example of 
a classification process. Furthermore, different examples of learning processes for the 
output score functions and the decision rules according to embodiments of the present 
invention are described. 



Notation

The notation used in the following description and examples is as follows:

$X$: The training set.

$\bar{x}$: An example from the training set.

$N_X$: Number of examples in the training set $X$.

$\bar{x}_j$: The j'th example from a given ordering of the training set $X$.

$\bar{y}$: A specific example (possibly outside the training set).

$C$: Class label.

$C(\bar{x})$: Class label corresponding to example $\bar{x}$ (the true class).

$C_W$: Winner class obtained by classification.

$C_T$: True class obtained by classification.

$N_C$: Number of training classes corresponding to the maximum number of rows in a LUT.

$\Omega$: Set of LUTs (each LUT may contain only a subset of all possible address
columns, and the different columns may register only subsets of the existing
classes).

$N_{LUT}$: Number of LUTs.

$N_{COL}$: Number of different columns that can be addressed in a specific LUT (LUT
dependent).

$X_C$: The set of training examples labelled class $C$.

$v_{iC}$: Entry counter for the cell addressed by the i'th column and the C'th class.

$a_i(\bar{y})$: Index of the column in the i'th LUT being addressed by example $\bar{y}$.

$\bar{v}$: Vector containing all $v_{iC}$ elements of the LUT network.

$Q_L$: Local quality function.

$Q_G$: Global quality function.

$B^{c_1,c_2}$: Decision rule matrix.

$M^{c_1,c_2}$: Cost matrix.

$S(\cdot)$: Score function.

$\Gamma(\cdot)$: Leave-one-out cross-validation score function.

$P$: Path matrix.

$\bar{\beta}$: Parameter vector.

$\Xi$: Set of decision rules.

$d_c$: Score value on class $c$.

$D(\cdot)$: Decision function.

Description of architecture and concept 

In the following, references are made to Fig. 1, which shows a block diagram of a
RAM classification network with Look Up Tables (LUTs), and Fig. 2, which shows a 
detailed block diagram of a single Look Up Table (LUT) according to an embodiment 
of the present invention. 

A RAM-net or LUT-net consists of a number of Look Up Tables (LUTs) (1.3). Let
the number of LUTs be denoted $N_{LUT}$. An example of an input data vector $\bar{y}$ to be
classified may be presented to an input module (1.1) of the LUT network. Each LUT
may sample a part of the input data, where different numbers of input signals may be
sampled for different LUTs (1.2) (in principle it is also possible to have one LUT
sampling the whole input space). The outputs of the LUTs may be fed (1.4) to an
output module (1.5) of the RAM classification network.

In Fig. 2 it is shown that for each LUT the sampled input data (2.1) of the example
presented to the LUT-net may be fed into an address selecting module (2.2). The
address selecting module (2.2) may from the input data calculate the address of one or
more specific columns (2.3) in the LUT. As an example, let the index of the column in
the i'th LUT being addressed by an input example $\bar{y}$ be calculated as $a_i(\bar{y})$. The
number of addressable columns in a specific LUT may be denoted $N_{COL}$, and varies in
general from one LUT to another. The information stored in a specific row of a LUT
may correspond to a specific class C (2.4). The maximum number of rows may then
correspond to the number of classes, $N_C$. The number of cells within a column
corresponds to the number of rows within the LUT. The column vector cells may
correspond to class specific entry counters of the column in question. The entry
counter value for the cell addressed by the i'th column and class C is denoted $v_{iC}$ (2.5).

The $v_{iC}$-values of the activated LUT columns (2.6) may be fed (1.4) to the output
module (1.5), where one or more output scores may be calculated for each class and
where these output scores in combination with a number of decision rules determine
the winning class.
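
As a hedged illustration of the address selecting module, the following minimal sketch assumes that the input example is a sequence of binary signals and that each LUT samples a fixed list of input positions, the column index being formed by reading the sampled bits as a binary number. Both the function name `column_address` and this particular encoding are assumptions made for illustration; the description above does not prescribe how the address is calculated.

```python
def column_address(example, sampled_positions):
    """Return the column index addressed in one LUT by a binary input example.

    example           -- sequence of 0/1 input signals (assumed binary)
    sampled_positions -- the input positions sampled by this LUT (hypothetical)
    """
    address = 0
    for pos in sampled_positions:
        # Append the sampled bit as the new least significant bit.
        address = (address << 1) | (example[pos] & 1)
    return address
```

With n sampled positions this encoding gives at most 2**n addressable columns in the LUT.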
Let $\bar{x} \in X$ denote an input data example used for training and let $\bar{y}$ denote an input data
example not belonging to the training set. Let $C(\bar{x})$ denote the class to which $\bar{x}$
belongs. The class assignment given to the example $\bar{y}$ is then obtained by calculating
one or more output scores for each class. The output scores obtained for class C are
calculated as functions of the $v_{iC}$ numbers addressed by the example $\bar{y}$ but will in
general also depend on a number of parameters $\bar{\beta}$. Let the m'th output score of class C
be denoted $S_{C,m}(v_{iC}, \bar{\beta})$. A classification is obtained by combining the obtained output
scores from all classes with a number of decision rules. The effect of the decision rules
is to define regions in the output score space that must be addressed by the output
score values to obtain a given winner class. The set of decision rules is denoted $\Xi$ and
corresponds to a set of decision borders.

Figure 3 shows an example of a block diagram of a computer classification system 
according to the present invention. Here a source such as a video camera or a database 
provides an input data signal or signals (3.0) describing the example to be classified. 
These data are fed to a pre-processing module (3.1) of a type which can extract
features, reduce, and transform the input data in a predetermined manner. An example
of such a pre-processing module is a FFT-board (Fast Fourier Transform). The
transformed data are then fed to a classification unit (3.2) comprising a RAM network
according to the present invention. The classification unit (3.2) outputs a ranked
classification list which might have associated confidences. The classification unit can 
be implemented by using software to programme a standard Personal Computer or 
programming a hardware device, e.g. using programmable gate arrays combined with 
RAM circuits and a digital signal processor. These data can be interpreted in a post- 
processing device (3.3), which could be a computer module combining the obtained 
classifications with other relevant information. Finally the result of this interpretation is 
fed to an output device (3.4) such as an actuator. 

Initial training of the architecture

The flow chart of Fig. 4 illustrates a one pass learning scheme or process for the 
determination of the column vector entry counter or cell distribution, $v_{iC}$-distribution
(4.0), according to an embodiment of the present invention, which may be described as 
follows: 

1. Initialise all entry counters or column vector cells by setting the cell values, $\bar{v}$,
to zero (4.1).

2. Present the first training input example, $\bar{x}_1$, from the training set to the
network (4.2, 4.3).

3. Calculate the columns addressed for the first LUT (4.4, 4.5).

4. Add 1 to the entry counters in the rows of the addressed columns that
correspond to the class label of $\bar{x}$ (increment $v_{a_i(\bar{x}),C(\bar{x})}$ in all LUTs) (4.6).

5. Repeat step 4 for the remaining LUTs (4.7, 4.8).

6. Repeat steps 3-5 for the remaining training input examples (4.9, 4.10). The
number of training examples is denoted $N_X$.
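
The one pass learning scheme above may be sketched as follows. This is an illustration under stated assumptions rather than the implementation of the invention: the inputs are assumed to be binary, each LUT is assumed to be described by a list of sampled input positions, and the names `address`, `train` and `luts` are hypothetical. The entry counters are held in one dictionary per LUT, keyed by (column index, class label), so that cells never addressed are implicitly zero.

```python
from collections import defaultdict


def address(example, positions):
    """Column index of a binary example in a LUT sampling `positions` (assumed encoding)."""
    a = 0
    for p in positions:
        a = (a << 1) | (example[p] & 1)
    return a


def train(examples, labels, luts):
    """One pass over the training set, incrementing class specific entry counters.

    luts -- list of sampled-position lists, one per LUT (hypothetical representation)
    Returns a list with one dict per LUT mapping (column, class) -> counter value.
    """
    counters = [defaultdict(int) for _ in luts]       # step 1: all counters start at zero
    for x, c in zip(examples, labels):                # steps 2 and 6: each training example
        for i, positions in enumerate(luts):          # steps 3 and 5: each LUT
            counters[i][(address(x, positions), c)] += 1   # step 4: add 1 for the true class
    return counters
```

Each training example is visited once, so the cost of training grows linearly with the number of examples and the number of LUTs.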
Initialisation of output score functions and decision rules

Before the trained network can be used for classification the output score functions 
and the decision rules must be initialised. 



Classification of an unknown input example

When the RAM network of the present invention has been trained to thereby determine 
values for the column cells whereby the LUTs may be defined, the network may be 
used for classifying an unknown input data example. 

In a preferred example according to the present invention, the classification is
performed by using the decision rules $\Xi$ and the output scores obtained from the
output score functions. Let the decision function invoking $\Xi$ and the output scores be
denoted $D(\cdot)$. The winning class can then be written as:

$$C_W = D(\Xi, S_{1,1}, S_{1,2}, \ldots, S_{N_C,m})$$


Figure 6 shows a block diagram of the operation of a computer classification system in
which a classification process (6.0) is performed. The system acquires one or more
input signals (6.1) using e.g. an optical sensor system. The obtained input data are pre-
processed (6.2) in a pre-processing module, e.g. a low-pass filter, and presented to a 
classification module (6.3) which according to an embodiment of the invention may be 
a LUT-network. The output data from the classification module is then post-processed 
in a post-processing module (6.4), e.g. a CRC algorithm calculating a cyclic 
redundancy check sum, and the result is forwarded to an output device (6.5), which 
could be a monitor screen. 
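
The operation of the classification module on an unknown example may be illustrated by the following minimal sketch. It assumes one output score per class, obtained by counting the LUTs whose addressed cell holds at least `k` entries, and a winner-take-all decision; this particular score function and the names `scores`, `winner_take_all` and `addresses` are assumptions made for illustration. The entry counters are assumed to be stored as one dictionary per LUT keyed by (column index, class label).

```python
def scores(counters, addresses, classes, k=1):
    """One output score per class: number of LUTs whose addressed cell is >= k.

    counters  -- list of dicts, one per LUT, mapping (column, class) -> entry counter
    addresses -- column index addressed in each LUT by the example to classify
    k         -- threshold parameter (assumed form of the score function)
    """
    return {
        c: sum(1 for i, a in enumerate(addresses) if counters[i].get((a, c), 0) >= k)
        for c in classes
    }


def winner_take_all(score_by_class):
    """Return the class with the highest score (ties resolved by dictionary order)."""
    return max(score_by_class, key=score_by_class.get)
```

Other output score functions and decision rules can be substituted without changing the surrounding flow of addressing, scoring and deciding.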

Adjustment of output score function parameter $\bar{\beta}$ and adjustment of decision rules $\Xi$

Usually the initially determined values of $\bar{\beta}$ and the initial set of rules $\Xi$ will not present
the optimal choices. Thus, according to a preferred embodiment of the present
invention, an optimisation or adjustment of the $\bar{\beta}$ values and the $\Xi$ rules should be
performed.

In order to select or adjust the parameters $\bar{\beta}$ and the rules $\Xi$ to improve the
performance of the classification system, it is suggested according to an embodiment
of the invention to define proper quality functions for measuring the performance of
the $\bar{\beta}$-values and the $\Xi$-rules. Thus, a local quality function $Q_L(\bar{v}, \bar{x}, X, \bar{\beta}, \Xi)$ may be
defined, where $\bar{v}$ denotes a vector containing all $v_{iC}$ elements of the LUT network.
The local quality function may give a confidence measure of the output classification of
a specific example $\bar{x}$. If the quality value does not satisfy a given criterion the $\bar{\beta}$ values
and the $\Xi$ rules are adjusted to make the quality value satisfy or come closer to satisfying the
criterion (if possible).

Furthermore a global quality function $Q_G(\bar{v}, X, \bar{\beta}, \Xi)$ may be defined. The global quality
function may measure the performance of the input training set as a whole.
Fig. 5 shows a flow chart for adjustment or learning of the $\bar{\beta}$ values and the $\Xi$ rules
according to the present invention.

Example 1 

This example illustrates an optimisation procedure for adjusting the decision rules $\Xi$.
For each class c we define a single output score function: 

- ZA<Mv M ,>.,), l« (A> A.-) 
where $\delta_{i,j}$ is Kronecker's delta ($\delta_{i,j} = 1$ if $i = j$ and 0 otherwise), and

Here k is a parameter which may be set to any desired value. In this example it is preferred to set k to 
a value of one. 

The expression for the output score function illustrates a possible family of functions determined by a
parameter vector $\bar{\beta}$. This example, however, will only illustrate a procedure for adjusting the decision
rules $\Xi$, and not $\bar{\beta}$. For simplicity of notation we therefore initialise all values in $\bar{\beta}$ to one. We then
have:

scO 

Thus, for this example, the above output score function itself is not a function of the column vector
cell values, but the column vector cell values are used as input parameters in order to calculate the
output score values.

The leave-one-out cross-validation vote-count on a given class c is:
tma 

where $C_T$ denotes the true class of example $\bar{x}$.

For all possible inter-class combinations, $(c_1, c_2)$, we wish to determine a suitable decision border in
the score space spanned by the two classes. The matrix $B^{c_1,c_2}$ is defined to contain the decisions
corresponding to a given set of decision rules applied to the two corresponding output score values; i.e.
whether class $c_1$ or class $c_2$ wins. The row and column dimensions are given by the allowed ranges of
the two output score values, possibly after a discretisation procedure. The matrix elements contain
one of the following three values: $c_1$, $c_2$ and $k_{AMB}$, where $k_{AMB}$ is a constant different from $c_1$ and
$c_2$. The two output score values $S_1$ and $S_2$ obtained on class $c_1$ and class $c_2$, respectively, address the
element $b^{c_1,c_2}_{S_1,S_2}$ in the matrix $B^{c_1,c_2}$. If the addressed element contains the value $c_1$ it means that class
$c_1$ wins over class $c_2$. If the addressed element contains the value $c_2$ it means that class $c_2$ wins over
class $c_1$. Finally, if the addressed element contains the value $k_{AMB}$, it means the decision is
ambiguous.

The decision rules are initialised to correspond to a WTA (winner-take-all) decision. This corresponds to having a
decision border along the diagonal in the matrix $B^{c_1,c_2}$. Along the diagonal the elements are
initialised to take on the value $k_{AMB}$. Above and below the diagonal, respectively, the elements are
labelled with opposite class values.
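
The winner-take-all initialisation of a pairwise decision matrix may be sketched as follows. The sketch assumes integer score values from 0 to `max_score`, with the row index given by the score on the first class and the column index by the score on the second class; these conventions and the names `init_wta_matrix`, `decide` and `k_amb` are assumptions made for illustration.

```python
def init_wta_matrix(max_score, c1, c2, k_amb=None):
    """Build a (max_score + 1) x (max_score + 1) decision matrix for the pair (c1, c2).

    Diagonal elements hold the ambiguity marker k_amb; elements where the score on c1
    exceeds the score on c2 hold c1, and the remaining elements hold c2.
    """
    return [
        [k_amb if s1 == s2 else (c1 if s1 > s2 else c2) for s2 in range(max_score + 1)]
        for s1 in range(max_score + 1)
    ]


def decide(b, s1, s2):
    """Look up the decision addressed by the two output score values."""
    return b[s1][s2]
```

Adjusting the decision rules then amounts to relabelling elements of this matrix so that the border between the two class regions moves away from the diagonal.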

A strategy for adjusting the initialised decision border according to an information measure that uses
the $\Gamma$ values is outlined below.

Create the cost matrix $M^{c_1,c_2}$ with elements given as:
^> te Zfo(S)*#Ai\(x)*y)* 

$\alpha_{c_1,c_2}$ denotes the cost associated with classifying an example from class $c_2$ into class $c_1$ and
$\alpha_{c_2,c_1}$ denotes the cost associated with the opposite error. It is here assumed that a logical true
evaluates to one and a logical false evaluates to zero.

A minimal-cost path from $m_{1,1}$ to the opposite corner of the cost matrix can be calculated using e.g. a dynamic
programming approach ($Q_c$ measures the total cost along a given path):

Loop through all entries in the cost matrix in reverse order: 




For each entry, calculate the lowest associated total-costs given as 
and note the corresponding direction in the path-matrix P: 

(Indexes outside the matrix are considered as containing the value $\infty$.)

According to the dynamic programming approach the path with the smallest associated total-
cost is now obtained by traversing the P-matrix in the following manner to obtain the decision
border in the score space spanned by the classes in question. ($b^{c_1,c_2}_{i,j}$ denotes an element in a
matrix with dimensions given by the maximum number of possible votes on class $c_1$ times the
maximum number of possible votes on $c_2$.)



1. Set i = 1 and j = 1.

2. Set $b^{c_1,c_2}_{i,j} = k_{AMB}$ to indicate an ambiguous decision.

3. Set $b^{c_1,c_2}_{l,j} = c_1$ for all $l > i$ and set $b^{c_1,c_2}_{i,l} = c_2$ for all $l > j$ to indicate the winning classes.

4. Update i and j according to the value of $p_{i,j}$.

5. Repeat 2-4 until the lower left corner is reached.
The dynamic programming approach can be extended with regularisation terms, which
constrain the shape of the border.

An alternative method for determining the decision border could be to fit a B-spline with two
control points in such a way that the associated cost is minimised.
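
A generic minimal-cost path search by dynamic programming may be sketched as follows. The recurrence used in the example above is not legible in this text, so the sketch is an assumption throughout: it uses 0-based indices, allows steps of one row, one column or one diagonal at a time, treats positions outside the matrix as having infinite cost, and uses the hypothetical names `min_cost_path`, `q` and `p` for the function, the total-cost matrix and the path matrix.

```python
import math


def min_cost_path(m):
    """Return (total cost, path) of a cheapest path from m[0][0] to the opposite corner.

    m is a non-empty rectangular list of lists of non-negative costs.
    Allowed steps (an assumption): down (1, 0), right (0, 1), diagonal (1, 1).
    """
    rows, cols = len(m), len(m[0])
    q = [[math.inf] * cols for _ in range(rows)]   # lowest total cost from each entry
    p = [[None] * cols for _ in range(rows)]       # direction taken from each entry
    for i in reversed(range(rows)):                # loop through entries in reverse order
        for j in reversed(range(cols)):
            if i == rows - 1 and j == cols - 1:
                q[i][j] = m[i][j]                  # end point of the path
                continue
            best, step = math.inf, None
            for di, dj in ((1, 0), (0, 1), (1, 1)):
                ni, nj = i + di, j + dj
                # Indexes outside the matrix count as infinite cost.
                cost = q[ni][nj] if ni < rows and nj < cols else math.inf
                if cost < best:
                    best, step = cost, (di, dj)
            q[i][j] = m[i][j] + best
            p[i][j] = step
    path, i, j = [(0, 0)], 0, 0
    while p[i][j] is not None:                     # traverse the path matrix
        di, dj = p[i][j]
        i, j = i + di, j + dj
        path.append((i, j))
    return q[0][0], path
```

The returned path plays the role of the decision border: elements on the path would be marked ambiguous and elements on either side assigned to the two classes.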

Using the decision borders determined from the strategy outlined above an example can now be 
classified in the following manner: 

• Present the example to the network in order to obtain the score values or vote numbers 

• Define a new set of score values $d_c$ for all classes and initialise the scores to zero: $d_c = 0$,

• Loop through all possible inter-class combinations, $(c_1, c_2)$, and update the vote-values:

• The winning class can now be found as $\arg\max_c(d_c)$
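
The pairwise classification procedure above may be sketched as follows. The update rule for the vote-values is not legible in this text, so the sketch assumes that each pairwise decision adds one vote to the winning class and that an ambiguous decision adds no vote. The names `classify_pairwise` and `decide` are hypothetical; `decide` stands for a lookup in the decision matrix of the class pair in question.

```python
from itertools import combinations


def classify_pairwise(scores, decide):
    """Combine pairwise decisions into a single winning class.

    scores -- dict mapping class label -> output score value
    decide -- callable (c1, c2, s1, s2) returning c1, c2, or any other value
              (for example None) when the decision is ambiguous
    Returns (winning class, votes per class).
    """
    votes = {c: 0 for c in scores}                 # initialise all vote-values to zero
    for c1, c2 in combinations(scores, 2):         # all inter-class combinations
        winner = decide(c1, c2, scores[c1], scores[c2])
        if winner in votes:                        # ambiguous decisions add no vote
            votes[winner] += 1
    return max(votes, key=votes.get), votes
```

With N classes the loop evaluates N(N-1)/2 pairwise decisions per example.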



A leave-one-out cross-validation test using the decision borders determined from the strategy outlined
above is obtained in the following manner:

• Present the example to the network in order to obtain the leave-one-out score values or vote
numbers

• r c(*) = Z e *-*e*n>..cnc) 

tfcO 

• Define a new set of score values $d_c$ for all classes and initialise the scores to zero: $d_c = 0$,

• Loop through all possible inter-class combinations, $(c_1, c_2)$, and update the vote-values:

% u t n % (> tr n it) 

• The winning class can now be found as $\arg\max_c(d_c)$

With reference to Figure 5 the above adjustment procedure for the decision rules (borders) $\Xi$ may be
described as

• Initialise the system by setting all values of $\bar{\beta}$ to one, selecting a WTA scheme on a two by two
basis and by training the n-tuple classifier according to the flow chart in Fig. 4. (5.0)

• Batch mode optimisation is chosen. (5.1)

• Test all examples by performing a leave-one-out classification as outlined above (5.12) and
calculate the obtained leave-one-out cross-validation error rate and use it as the $Q_G$-measure.
(5.13)

• Store the values of $\bar{\beta}$ and the corresponding $Q_G$-value as well as the $\Xi$-rules. (5.14)

• If the $Q_G$-value does not satisfy a given criterion and no other stop criterion is met then adjust the
$\Xi$-rules according to the dynamic programming approach outlined above. (5.16, 5.15)

• If the $Q_G$-criterion is satisfied or another stop criterion is met then select the combination with the
lowest total error-rate. (5.17)

In the above case one could, as an alternative stop criterion, use a criterion that only allows two loops
through the adjustment scheme.

Example 2 

This example illustrates an optimisation procedure for adjusting $\bar{\beta}$.
For each class we again define a single output score
and in this example we use $\bar{\beta} = (k_{c_1}, k_{c_2}, \ldots, k_{c_{N_C}})$. We also initialise the $\Xi$ rules to describe a WTA
decision when comparing the output scores from the different classes.

• Initialise the system by setting all $k_c$-values to one, selecting a WTA scheme and by training the
n-tuple classifier according to the flow chart in Fig. 4. (5.0)

• Batch mode optimisation is chosen. (5.1)

• Test all examples using a leave-one-out cross-validation test (5.12) and calculate the obtained
leave-one-out cross-validation error rate used as $Q_G$. (5.13)

• Store the values of $\bar{\beta}$ and the corresponding $Q_G$ value. (5.14)

• Loop through all possible combinations of $k_{c_1}, k_{c_2}, \ldots, k_{c_{N_C}}$, where each $k_{c_j} \in \{1, 2, \ldots, k_{MAX}\}$. (5.16,
5.15)

• Select the combination with the lowest total error-rate. (5.17)

For practical use, the $k_{MAX}$-value will depend upon the skewness of the class priors and the number of
address-lines used in the RAM net system.
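
The batch mode search over parameter combinations in this example may be sketched as an exhaustive grid search. The sketch assumes one integer parameter per class taking values from 1 to `k_max`, and it takes the error measure as a caller-supplied function; the names `grid_search`, `error_rate` and `k_max` are hypothetical.

```python
from itertools import product


def grid_search(classes, k_max, error_rate):
    """Try every combination of per-class parameters and keep the one with lowest error.

    classes    -- list of class labels
    k_max      -- largest parameter value tried for each class (values start at 1)
    error_rate -- callable taking a dict {class: k} and returning a number, e.g. a
                  leave-one-out cross-validation error rate (supplied by the caller)
    Returns (best parameter dict, its error).
    """
    best = None
    for ks in product(range(1, k_max + 1), repeat=len(classes)):
        beta = dict(zip(classes, ks))
        err = error_rate(beta)
        if best is None or err < best[1]:          # keep the lowest error found so far
            best = (beta, err)
    return best
```

The number of combinations is `k_max` raised to the number of classes, so this exhaustive search is only practical for a small number of classes or a small `k_max`.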

Example 3 

This example also illustrates an optimisation procedure for adjusting $\bar{\beta}$ but with the use of a local
quality function $Q_L$.

For each class we now define as many output scores as there are competing classes, i.e. $N_C - 1$ output
scores:

s.,.„ (v. lW , y ,fi) - xe, Km4 * j. 

and in this example we use 

P ~ **«^ ** - * *%.c, .c^., ) . 

Wc also initialise the £ rules to describe a WTA decision when comparing the output scores from the 
different classes. 

• Initialise the system by setting all -values to say two. selecting a WTA scheme and by 
training the n-iuple classifier according to the flow chart in Fig. 4. (5.0) 

* On line mode as opposed to botch mode optimisation is chosen, (5.1) 

• Loop through all examples in the training set (5.2. 5.7. and 5.8) 

• Test each example to obtain the winner class C w in a leave-one-crossvalidation. Let the Q L - 
measurc compare C w with the true class C T . (5.3,5.4) 

♦ If C w * O a leave-onc-oul error is nude so the values of Jr^ Cf and k tr€t are adjusted by 
incrementing k^ <r with a small value, say 0. 1, and by decrementing k €f Cw with a small value, 
say 0.05. If the adjustment will bring the values below one, no adjustment is performed. (5.5,5.6) 

• When all examples have been processed the global information measure $Q_G$ (e.g. the leave-one- 
out error rate) is calculated and the values of $\beta$ and $Q_G$ are stored. (5.9, 5.10) 

• If $Q_G$ or another stop criterion is not fulfilled the above loop is repeated. (5.11) 

• If $Q_G$ is satisfied or another stop criterion is fulfilled the best value of the stored $Q_G$-values is 
chosen together with the corresponding parameter values $\beta$ and decision rules $\Xi$. (5.17, 5.18) 
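
The on-line procedure of Example 3 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the implementation of the invention: `adjust` and `online_optimise` are hypothetical names, `loo_winner` is a hypothetical callable supplied by the caller that returns the leave-one-out winner class for an example under the current pairwise thresholds `k[i][j]`, and the rule "no adjustment is performed" is read here as skipping both the increment and the decrement when the decrement would bring a value below one.

```python
import copy

def adjust(k, winner, true_c, up=0.1, down=0.05):
    """Update the pairwise thresholds in place after one leave-one-out test."""
    if winner == true_c:
        return                        # no error, nothing to adjust
    lowered = k[true_c][winner] - down
    if lowered < 1.0:
        return                        # assumed reading: skip the whole adjustment
    k[winner][true_c] += up           # make it harder for the wrong winner
    k[true_c][winner] = lowered       # make it easier for the true class

def online_optimise(examples, k, loo_winner, max_epochs=10, target=0.0):
    """Repeat on-line passes and keep the best thresholds seen so far."""
    best_q, best_k = None, None
    for _ in range(max_epochs):
        for x, true_c in examples:                        # (5.2, 5.7, 5.8)
            adjust(k, loo_winner(x, true_c, k), true_c)   # (5.3 to 5.6)
        errors = sum(loo_winner(x, c, k) != c for x, c in examples)
        q = errors / len(examples)                        # global measure (5.9)
        if best_q is None or q < best_q:
            best_q, best_k = q, copy.deepcopy(k)          # store values (5.10)
        if q <= target:                                   # stop criterion (5.11)
            break
    return best_q, best_k                                 # (5.17, 5.18)
```

The diagonal entries `k[i][i]` are never read or written in this sketch, since a class does not compete with itself. The iteration cap `max_epochs` stands in for "another stop criterion" and its default value is arbitrary.
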

The foregoing description of preferred exemplary embodiments of the invention has 
been presented for the purpose of illustration and description. It is not intended to be 
exhaustive or to limit the invention to the precise form disclosed, and obviously many 
modifications and variations are possible in light of the present invention to those 
skilled in the art. All such modifications which retain the basic underlying principles 
disclosed and claimed herein are within the scope of this invention. 

CLAIMS 

1. A method of training a computer classification system which can be defined 
by a network comprising a number of n-tuples or Look Up Tables (LUTs), with each 
n-tuple or LUT comprising a number of rows corresponding to at least a subset of 
possible classes and further comprising a number of columns being addressed by 
signals or elements of sampled training input data examples, each column being 
defined by a vector having cells with values, said method comprising 

determining the column vector cell values based on one or more training sets 
of input data examples for different classes so that at least part of the cells comprise or 
point to information based on the number of times the corresponding cell address is 
sampled from one or more sets of training input examples, and 

determining one or more output score functions for evaluation of at least one 
output score value per class, and/or 

determining one or more decision rules to be used in combination with at least 
part of the obtained output scores to determine a winning class, 

said output score functions and/or decision rules being determined based on 
the information of at least part of the determined column vector cell values. 

2. A method according to claim 1, wherein the output score functions and/or the 
decision rules are determined based on a validation set of input data examples. 

3. A method according to claim 2, wherein the validation set comprises at least 
part of the training set(s) of input data examples. 

4. A method according to any of the claims 1-3, wherein the output score 
functions are determined by a set of parameter values. 

5. A method according to any of the claims 1-4, wherein determination of the 
output score functions and/or the decision rules is based on an information measure 
evaluating the performance on the validation example set, said evaluating measure 
preferably being a leave-one-out cross validation test. 

6. A method according to any of the claims 1-5, wherein an output score space is 
given by the output score variable containing the output score values, and 

the decision rules define regions in the output score space to be addressed by 
obtained output score values to obtain a winning class. 

7. A method according to any of the claims 1-6, wherein determination of the 
output score functions and/or the decision rules comprises initialising the output score 
functions and/or the decision rules. 

8. A method according to claim 7, wherein the initialisation of the output score 
functions comprises determining a number of set-up parameters. 

9. A method according to claims 7 or 8, wherein the initialisation of the output 
score functions comprises setting all output score functions to a pre-determined 
mapping function. 

10. A method according to any of the claims 7-9, wherein the initialisation of the 
decision rules comprises setting the rules to a pre-determined decision scheme. 

11. A method according to any of the claims 1-10, further comprising adjusting 
the output score functions and/or the decision rules, said adjustment preferably being 
based on an information measure evaluation. 

12. A method according to claim 11, wherein said information measure evaluation 
is a leave-one-out cross validation test. 

13. A method according to claim 8 and any of the claims 11-12, wherein the 
adjustment comprises changing the values of the set-up parameters. 

14. A method according to any of the claims 1-13, wherein the determination of 
the column vector cell values comprises the training steps of 

a) applying a training input data example of a known class to the 
classification network, thereby addressing one or more column vectors, 

b) incrementing, preferably by one, the value or vote of the cells of the 
addressed column vector(s) corresponding to the row(s) of the known 
class, and 

c) repeating steps (a)-(b) until all training examples have been applied to 
the network. 



15. A method according to any of the claims 11-14, wherein the adjustment 
comprises the steps of 
determining a global quality value based on at least part of the column vector 
cell values, 

determining if the global quality value fulfils a required quality criterion, and 
adjusting at least part of output score functions and/or part of the decision 
rules until the global quality criterion is fulfilled. 

16. A method according to any of the claims 11-15, wherein the adjustment 
process comprises the steps of 

a) selecting an input example from the validation set(s), 

b) determining a local quality value corresponding to the sampled validation 
input example, the local quality value being a function of at least part of the 
addressed column cell values, 

c) determining if the local quality value fulfils a required local quality criterion, 
if not, 

adjusting one or more of the output score functions and/or decision rules if the 
local quality criterion is not fulfilled, 

d) selecting a new input example from a predetermined number of examples of 
the validation set(s), 

e) repeating the local quality test steps (b)-(d) for all the predetermined validation 
input examples, 

f) determining a global quality value based on at least part of the column vectors 
being addressed during the local quality test, 

g) determining if the global quality value fulfils a required global quality 
criterion, and, 

h) repeating steps (a)-(g) until the global quality criterion is fulfilled. 

17. A method according to claim 16, wherein steps (b)-(d) are carried out for all 
examples of the validation set(s). 

18. A method according to any of the claims 15-17, wherein the local and/or 
global quality value is defined as functions of at least part of the column cells. 

19. A method according to any of the claims 15-18, wherein the adjustment 
iteration process is stopped if the quality criterion is not fulfilled after a given number 
of iterations. 



20. A method of classifying input data examples into at least one of a plurality of 
classes using a computer classification system configured according to any of the 
claims 1-19, whereby column cell values for each n-tuple or LUT and output score 
functions and/or decision rules are determined using one or more training or 
validation sets of input data examples, said method comprising 

a) applying an input data example to be classified to the configured classification 
network thereby addressing column vectors in the set of n-tuples or LUTs, 

b) selecting a set of classes which are to be compared using a given set of output 
score functions and decision rules thereby addressing specific rows in the set of 
n-tuples or LUTs, 

c) determining output score values as a function of the column vector cells and 
using the determined output score functions, 

d) comparing the calculated output values using the determined decision rules, and 

e) selecting the class or classes that win(s) according to the decision rules. 

21. A system for training a computer classification system which can be defined 
by a network comprising a stored number of n-tuples or Look Up Tables (LUTs), with 
each n-tuple or LUT comprising a number of rows corresponding to at least a subset 
of possible classes and further comprising a number of columns being addressed by 
signals or elements of sampled training input data examples, each column being 
defined by a vector having cells with values, said system comprising 

a) input means for receiving training input data examples of known classes, 

b) means for sampling the received input data examples and addressing column 
vectors in the stored set of n-tuples or LUTs, 

c) means for addressing specific rows in the set of n-tuples or LUTs, said rows 
corresponding to a known class, 

d) storage means for storing determined n-tuples or LUTs, 

e) means for determining column vector cell values so as to comprise or point to 
information based on the number of times the corresponding cell address is 
sampled from the training set(s) of input examples, and 

f) means for determining one or more output score functions and/or one or more 
decision rules, said output score functions and/or decision rules determining means 
being adapted to determine said functions and/or rules based on the information of at 
least part of the determined column vector cell values. 

22. A system according to claim 21, wherein the means for determining the output 
score functions is adapted to determine such functions from a family of output score 
functions determined by a set of parameter values. 

23. A system according to claim 21 or 22, wherein the means for determining the 
output score functions and/or the decision rules is adapted to determine such functions 
and/or rules based on a validation set of input data examples of known classes, said 
validation set preferably comprising at least part of the training set(s) used for 
determining the column cell values. 

24. A system according to any of the claims 21-23, wherein the means for 
determining the output score functions and decision rules comprises 

means for initialising one or more sets of output score functions and/or decision rules, 
and 

means for adjusting output score functions and decision rules by use of at least part of 
the validation set of input examples. 

25. A system according to any of the claims 21-24, wherein the means for 
determining the column vector cell values is adapted to determine these values as a 
function of the number of times the corresponding cell address is sampled from the 
set(s) of training input examples. 

26. A system according to any of the claims 21-25, wherein, when a training input 
data example belonging to a known class is applied to the classification network 
thereby addressing one or more column vectors, the means for determining the 
column vector cell values is adapted to increment the value or vote of the cells of the 
addressed column vector(s) corresponding to the row(s) of the known class, said value 
preferably being incremented by one. 

27. A system according to any of the claims 24-26, wherein the means for 
adjusting output score functions and/or decision rules is adapted to 

determine a global quality value based on at least part of column vector cell 
values, 

determine if the global quality value fulfils a required global quality criterion, 
and 

adjust at least part of the output score functions and/or decision rules until the 
global quality criterion is fulfilled. 

28. A system according to any of the claims 24-27, wherein the means for 
adjusting output score functions and decision rules is adapted to 

a) determine a local quality value corresponding to a sampled validation input 
example, the local quality value being a function of at least part of the 
addressed vector cell values, 

b) determine if the local quality value fulfils a required local quality criterion, 

c) adjust one or more of the output score functions and/or decision rules if the 
local quality criterion is not fulfilled, 

d) repeat the local quality test for a predetermined number of training input 
examples, 

e) determine a global quality value based on at least part of the column vectors 
being addressed during the local quality test, 

f) determine if the global quality value fulfils a required global quality criterion, 
and, 

g) repeat the local and the global quality test until the global quality criterion is 
fulfilled. 

29. A system according to any of the claims 27 or 28, wherein the means for 
adjusting the output score functions and decision rules is further adapted to stop the 
iteration process if the global quality criterion is not fulfilled after a given number of 
iterations. 

30. A system according to any of the claims 21-29, wherein the means for storing 
n-tuples or LUTs comprises means for storing adjusted output score functions and 
decision rules and separate means for storing best so far output score functions and 
decision rules or best so far classification system configuration values. 

31. A system according to claim 30, wherein the means for adjusting the output 
score functions and decision rules is further adapted to replace previously separately 
stored best so far output score functions and decision rules with obtained adjusted 
output score functions and decision rules if the determined global quality value is 
closer to fulfilling the global quality criterion than the global quality value corresponding 
to previously separately stored best so far output score functions and decision rules. 



32. A system for classifying input data examples of unknown classes into at least 
one of a plurality of classes, said system comprising: 

storage means for storing a number or set of n-tuples or Look Up Tables 
(LUTs) with each n-tuple or LUT comprising a number of rows corresponding 
to at least a subset of the number of possible classes and further comprising a 
number of column vectors, each column vector being addressed by signals or 
elements of a sampled input data example, and each column vector having cell 
values being determined during a training process based on one or more sets of 
training input data examples, 

storage means for storing one or more output score functions and/or one or 
more decision rules, each output score function and/or decision rule being 
determined during a training or validation process based on one or more sets 
of validation input data examples, said system further comprising: 

input means for receiving an input data example to be classified, 

means for sampling the received input data example and addressing column 
vectors in the stored set of n-tuples or LUTs, 

means for addressing specific rows in the set of n-tuples or LUTs, said rows 
corresponding to a specific class, 

means for determining output score values using the stored output score 
functions and at least part of the stored column vector values, and 

means for determining a winning class or classes based on the output score 
values and stored decision rules. 

33. A system according to claim 32, wherein the cell values of the column vectors 
and the output score functions and/or decision rules of the classification system are 
determined by use of a training system according to any of the claims 21-31. 

34. A system according to claim 32, wherein the column vector cell values and the 
output score functions and/or decision rules are determined during a training process 
according to any of the claims 1-19. 